KantoDB developer guide

Welcome to the KantoDB developer guide.

This document is intended to give a competent software developer, familiar with backend software, storage, data structures, and Rust, a better idea of how KantoDB works internally.

The Open Source source code for KantoDB is available at https://git.kantodb.com/kantodb/kantodb.

Keeping documentation up to date is a difficult task, doubly so for developer-oriented documentation. Statements made in this document may in fact be false. Projections of what work will be done in the future, and how things might work one day, are even more hazardous.

If anything here seems out of touch with reality, please reach out on the community forums.

Roadmap

As of 2025Q1, the content and order of goals are very much in flux.

Your input is very much welcome; please engage with us via a support contract or on the community forums.

About this page

Tags used in the roadmap
  • #external: Likely source of the bug is not us, but in a dependency
  • #maybe: We might not do this
  • #urgency/high
  • #urgency/medium
  • #urgency/low
  • #severity/high
  • #severity/medium
  • #severity/low
  • #size/small
  • #size/medium
  • #size/big
  • #api-change: Changes already-published API
  • #bug: Likely bug in the software (not documentation)
  • #enhancement: New feature or improving behavior some other way
  • #performance: Improves performance, minimal effect on other behaviors
  • #ops: Affects operational aspects, not functionality; for example, observability
  • #dev: Only affects development and building from source, not the final product
  • #test: Topic where test coverage needs to be better
  • #doc: Improvements or additions to documentation
  • #good-first-issue: Good for newcomers
  • #need-more-info: We don't know enough yet to work on this
  • #help-wanted: Extra attention would be appreciated
  • #ecosystem/*: We are hoping our dependencies will improve
  • #waiting: Cannot proceed at this time

The checklist items are in very rough priority order. Order will be adjusted based on feedback and discovery of critical paths.

The big picture

  • v0: Bootstrap to a real SQL database
  • v1: Production-ready database using RocksDB
  • v2: Pure Rust with new storage subsystem

Platforms

Support for non-Linux operating systems is not currently on the roadmap. We intend to actively use Linux-specific functionality for performance, security and simplicity.

For now, only x86-64 is actively tested, but everything is supposed to work cross-architecture (higher risk: 32-bit processing >4 GiB of data at once, 32-bit in general, big-endian).

Pending work

Work that is more or less "about to be done".

Unimplemented or buggy SQL operations

  • DataFusion explicitly refuses to work with many SQL types even when it has the corresponding underlying type; see https://docs.rs/datafusion/latest/datafusion/sql/planner/struct.SqlToRel.html#method.build_schema and convert_data_type, e.g. UInt64 vs UNSIGNED BIGINT
  • large number literals behave differently than in Postgres; e.g. selecting a literal one above i64::MAX yields a float, whereas Postgres appears to use arbitrary-precision numbers
  • CREATE UNIQUE INDEX .. [NULLS DISTINCT] with null columns (#ref/unimplemented_sql/w9j8q485anior)
    • use a multi-index with nullable columns in value not key, prefix scan on insert to detect conflict
  • UPDATE of column in primary key (#ref/unimplemented_sql/fpb157cronqcy)
  • tables without primary keys are not usable
    • DataFusion UPDATE semantics are tricky; we need to dissect the input LogicalPlan and convert it into a table scan of the current data (instead of blindly executing it), updating records as we see them in the scan
    • #bug DELETE (#ref/gs5muy13tu1uk) or UPDATE (#ref/i61jpodrazscs) for a table using row_id (not primary keys)
    • possible fix: adjust logical_plan to also get a magic-named column __kanto_row_key, use that to do deletes
    • possible fix: understand the logical_plan, don't use DataFusion to produce record batches, run a custom table/filter scan that also deletes on the fly
  • return number of rows affected
  • index existing data at CREATE INDEX time #severity/high #urgency/high
    • ideally go straight to concurrent index building logic, no table locks
  • INSERT RETURNING (#ref/hixyde6h9n77o), UPDATE RETURNING (#ref/pp5wz7nq4yhzs), DELETE RETURNING (#ref/yn84qzsk1f9ny) refused by DataFusion SQL parser, not sure how to work around #ecosystem/datafusion #severity/high #urgency/medium
  • CTEs (test coverage): SELECT/UPDATE/INSERT/DELETE inside a CTE, and SELECT/UPDATE/INSERT/DELETE following a CTE
  • rest of the data types
    • should we enforce VARCHAR(30) etc length limits, or truncate? forbid that syntax until clarified
    • INTEGER(10) etc precision
      • type { INTEGER(precision) | INT(precision) | INT4(precision) } (#ref/unimplemented_sql/tiptbb5d8buxa)
      • type { UNSIGNED INTEGER(precision) | UNSIGNED INT(precision) | UNSIGNED INT4(precision) } (#ref/unimplemented_sql/fzt1q8mfgkr6n)
      • type { (UNSIGNED) INT2(precision) | (UNSIGNED) SMALLINT(precision) } (#ref/unimplemented_sql/g57gd8499t4hg)
      • type { BIGINT(precision) | INT8(precision) } (#ref/unimplemented_sql/88fzntq3mz14g)
      • type { UNSIGNED BIGINT(precision) | UNSIGNED INT8(precision) } (#ref/unimplemented_sql/sze33j7qzt91s)
      • type TINYINT(precision) (#ref/unimplemented_sql/hueii8mjgk8qq)
      • type UNSIGNED TINYINT(precision) (#ref/unimplemented_sql/ewt6tneh7ecps)
    • type { NUMERIC | DECIMAL | DEC } (#ref/unimplemented_sql/uoakmzwax448h)
    • FLOAT(42) etc precision
      • type FLOAT(precision) (#ref/unimplemented_sql/xfwbr486jaw5q)
      • type DOUBLE(precision) (#ref/unimplemented_sql/wo43sei9ubpon)
    • NUMERIC(precision, scale), NUMERIC(precision), NUMERIC; aka DECIMAL(precision, scale)
    • type { CHARACTER | CHAR | FIXEDSTRING } (#ref/unimplemented_sql/hdhf8ygp9zna4): fixed-length strings
    • type INT128 (ClickHouse syntax) (#ref/unimplemented_sql/4sg3mftgxu9j4)
    • type INT256 (ClickHouse syntax) (#ref/unimplemented_sql/ex9mnxxipbqqs)
    • type UINT128 (ClickHouse syntax) (#ref/unimplemented_sql/riob7j6jahpte)
    • type UINT256 (ClickHouse syntax) (#ref/unimplemented_sql/q6tzjzd7xf3ga)
    • type { MEDIUMINT (UNSIGNED) } (#ref/unimplemented_sql/m7snzsrqj68sw)
    • type { BIGNUMERIC | BIGDECIMAL } (BigQuery syntax) (#ref/unimplemented_sql/an57g4d4uzwew)
    • type BIT (VARYING) (#ref/unimplemented_sql/88uneott9owc1)
    • type UUID (#ref/unimplemented_sql/531ax7gb73ce1)
    • type TIMESTAMP (#ref/unimplemented_sql/njad51sbxzwnn)
    • type TIME (#ref/unimplemented_sql/bppzhg7ck7xhr)
    • type DATE (#ref/unimplemented_sql/1r5ez1z8j7ryo)
    • type DATE32 (#ref/unimplemented_sql/781y5fxih7pnn)
    • type DATETIME (ClickHouse syntax) (#ref/unimplemented_sql/oak3s5g3rnutq)
    • type DATETIME (MySQL syntax) (#ref/unimplemented_sql/dojz85ngwo5wr)
    • type INTERVAL (#ref/unimplemented_sql/5wxynwyfua69s)
    • type { JSON | JSONB } (#ref/unimplemented_sql/abcw878jr35qc)
    • type BINARY (#ref/unimplemented_sql/i16wqwmie17eg)
    • type ENUM (#ref/unimplemented_sql/1geqh5qxdeoko)
    • type ARRAY (#ref/unimplemented_sql/hnmqj3qbtutn4)
    • type SET (#ref/unimplemented_sql/hm53sq3gco3sw)
    • type STRUCT (#ref/unimplemented_sql/nsapujxjjk9hg)
    • type MAP (#ref/unimplemented_sql/7ts8zgnafnp7g)
    • type UNION (#ref/unimplemented_sql/oownfhnfj5b9r)
    • type TUPLE (#ref/unimplemented_sql/x7qwkwdhdznxe)
    • type NESTED (#ref/unimplemented_sql/fa6nofmo1s49g)
    • type { CHARACTER LARGE OBJECT | CHAR LARGE OBJECT | CLOB } (#ref/unimplemented_sql/ew7uhufhkzj9w)
    • type TRIGGER (#ref/unimplemented_sql/zf5jx1ykoc5s1)
    • custom type (#ref/unimplemented_sql/dt315zutjqpzc)
    • type UNSPECIFIED (#ref/unimplemented_sql/916n48obgiqh6)
    • type LOWCARDINALITY (#ref/unimplemented_sql/78exey8xjz3yk)
    • type REGCLASS (Postgres syntax) (#ref/unimplemented_sql/quzjzcsusttgc)
    • type NULLABLE (ClickHouse syntax) (#ref/unimplemented_sql/ra5kspnstauu6)
  • emulate more Postgres system tables and functions, like pg_cancel_backend

Unimplemented SQL syntax support

See also DataFusion SQL support status: https://arrow.apache.org/datafusion/user-guide/sql/sql_status.html, roadmap https://arrow.apache.org/datafusion/contributor-guide/roadmap.html.

For reference, see Postgres SQL, SQLite SQL, and MariaDB SQL.

  • SET LOCAL (#ref/unimplemented_sql/7b1ab6tms3uz4)
  • SHOW variable (#ref/unimplemented_sql/7a69iu5nr1oxk)
  • CREATE TABLE IF NOT EXISTS (#ref/unimplemented_sql/pf1mfhd9sz3jn)
  • DROP TABLE (#ref/unimplemented_sql/47hnz51gohsx6)
  • ALTER TABLE (#ref/unimplemented_sql/dny7j9hx3ihha)
  • CREATE INDEX IF NOT EXISTS (#ref/unimplemented_sql/tzwc96kk9173n)
  • DROP INDEX (#ref/unimplemented_sql/4kgkg4jhqhfrw)
  • ALTER INDEX (#ref/unimplemented_sql/ha1fybdqqwa8k)
  • SET TRANSACTION (#ref/unimplemented_sql/yybqytw7q8bmn)
    • may cause a bigger refactoring of our transaction start; we currently do eager creation
  • CREATE TABLE without columns (#ref/unimplemented_sql/g7hdmcm5bk69a)
  • constraints on tables #severity/high #urgency/high
    • CREATE TABLE with table constraint CHECK (#ref/unimplemented_sql/8zaqnh9mb9tor)
  • constraints on columns
    • CREATE TABLE with column option { PRIMARY KEY | UNIQUE } (#ref/unimplemented_sql/hwcjakao83bue) #severity/high #urgency/medium
    • CREATE TABLE with column option CHECK (#ref/unimplemented_sql/dwcpda8aj66d6) #severity/high #urgency/medium
  • default and computed values for columns
  • CREATE INDEX with dotted column name (#ref/unimplemented_sql/kbezqzh7p6yky)
  • CREATE INDEX with non-column expression (#ref/unimplemented_sql/8cjqfzsfpnduw)
  • CREATE INDEX WHERE (#ref/unimplemented_sql/9134bk3fe98x6): partial indexes
  • CREATE INDEX .. WITH (#ref/unimplemented_sql/w7t6c8xrehsnh)
  • CREATE INDEX .. INCLUDE (#ref/unimplemented_sql/ip415b5s8sa6h): non-key included columns
  • foreign keys #severity/high #urgency/high
    • CREATE TABLE with table constraint FOREIGN KEY (#ref/unimplemented_sql/mamfde4rdzgeo)
    • CREATE TABLE with column option REFERENCES (#ref/unimplemented_sql/ufnoiinkkp3wy)
    • enforce
  • roles and access control #severity/high #urgency/medium
    • CREATE ROLE (#ref/unimplemented_sql/hqyj3srdk1g4h)
    • DROP ROLE (#ref/unimplemented_sql/q8wh8nqixoytq)
    • ALTER ROLE (#ref/unimplemented_sql/u6set5k1gzyhh)
    • GRANT (#ref/unimplemented_sql/mekqqm6hxy64s)
    • REVOKE (#ref/unimplemented_sql/wha7884jy4bty)
    • SET ROLE (#ref/unimplemented_sql/ryqxdr55r49sr)
  • prepared statements
    • PREPARE (#ref/unimplemented_sql/fgi986yc4d7cw)
    • EXECUTE (#ref/unimplemented_sql/3w91mfrbzguyh)
    • DEALLOCATE (#ref/unimplemented_sql/fn1q45ys1t8ks)
  • views, first dynamic, then materialized, then incremental updates etc extras #severity/high
    • CREATE VIEW (#ref/unimplemented_sql/b9m5uhu9pnnsw)
    • ALTER VIEW (#ref/unimplemented_sql/uhwpi8nzojcxy)
    • DROP VIEW (#ref/unimplemented_sql/hwk7gp6ffh8zy)
  • TRUNCATE (#ref/unimplemented_sql/uf3rr4diw3o1o)
  • temporary tables
    • CREATE TEMPORARY TABLE (#ref/unimplemented_sql/3u561ykck5m76)
  • sequences (exposed to SQL)
    • CREATE SEQUENCE (#ref/unimplemented_sql/rdkjx9ryredcw)
    • DROP SEQUENCE (#ref/unimplemented_sql/ca149fq1ixm5h)
  • START TRANSACTION READ ONLY (#ref/unimplemented_sql/bhetj3emnc8n1)
  • multiple schemas
    • make schema creation explicit not implicit
    • CREATE SCHEMA (#ref/unimplemented_sql/x9cna73uf8cew)
    • DROP SCHEMA (#ref/unimplemented_sql/5rnpn33hia5a6)
  • multiple databases/catalogs/backends; all three might be the same thing for us
    • re-evaluate the whole "catalog name is backend" idea, especially when a backend can store multiple catalogs, duplication can lead to mismatch, renames are weird; maybe it's a lookup order?
      • is a backend a TABLESPACE?
        • sqlparser-rs doesn't have TABLESPACE support as of 2025-03
    • re-evaluate the idea of invisibly spreading every transaction across all backends; maybe force each transaction to stick to one, maybe make it lazy
    • Postgres docs specifically say "Databases are equivalent to catalogs"
    • CREATE DATABASE (#ref/unimplemented_sql/6c76p5eqsuea6)
    • ALTER DATABASE: not in sqlparser v0.54.0 #ecosystem/sqlparser-rs
    • DROP DATABASE (#ref/unimplemented_sql/qn6s3c397johw)
  • EXPLAIN (#ref/unimplemented_sql/fugjaua7zkfd4)
  • CREATE INDEX DESC (#ref/unimplemented_sql/n7x3zhjk7bf8k): descending sort order
  • CREATE INDEX CONCURRENTLY (#ref/unimplemented_sql/wya1a3w874xba)
  • COMMIT AND CHAIN (#ref/unimplemented_sql/ic5jpcqsjghm6)
  • ROLLBACK AND CHAIN (#ref/unimplemented_sql/q1pay866nux9n)
  • COPY (#ref/unimplemented_sql/1fxo667s3wwes): the kind that transfers data over wire; missing support in https://github.com/sunng87/pgwire #waiting
    • we do not intend to support anything where server-side file read/write gets path from user over network
  • RELEASE SAVEPOINT (#ref/unimplemented_sql/hyzhkewjpnug6): #severity/low #urgency/low #size/small
    • test error from trying to use it again
  • { LISTEN | UNLISTEN | NOTIFY } (#ref/unimplemented_sql/8t9hqpx4umzaq)
  • user-defined functions #severity/medium #urgency/low
    • CREATE FUNCTION (#ref/unimplemented_sql/fj9ny1gu79cbg)
    • DROP FUNCTION (#ref/unimplemented_sql/ptp9d3xe1456h)
    • SHOW FUNCTIONS (#ref/unimplemented_sql/yqiswgsqjc5uh): #severity/low
  • user-defined types #severity/medium #urgency/low
    • CREATE TYPE (#ref/unimplemented_sql/ur33y834br1wa): user-defined types (see [Postgres](https://www.postgresql.org/docs/16/sql-createtype.html), DataFusion issue)
    • DROP TYPE (#ref/unimplemented_sql/t1fti3yz8jsdn)
  • collation, as in locale-specific ordering
  • procedures #severity/medium #urgency/low
    • CREATE PROCEDURE (#ref/unimplemented_sql/69jzm59u8c6jc)
    • CALL (#ref/unimplemented_sql/dx5ic5bkaaq7c): see Postgres
    • DROP PROCEDURE (#ref/unimplemented_sql/w173ctuodtcaa)
  • triggers #severity/medium #urgency/low
    • CREATE TRIGGER (#ref/unimplemented_sql/cxts4du59pwqn)
    • DROP TRIGGER (#ref/unimplemented_sql/i1omqjczqksqk)
  • KILL [CONNECTION] (#ref/unimplemented_sql/jqji1j6my4f5w): #severity/medium #urgency/low
  • KILL QUERY (#ref/unimplemented_sql/77ecapaed3bso): #severity/medium #urgency/low
  • MERGE (#ref/unimplemented_sql/y5gzyy5kxg5ws): #severity/low #urgency/low
  • CREATE TABLE .. COMMENT (#ref/unimplemented_sql/dgyhzknpfx756)
  • CREATE TABLE with column option COMMENT (#ref/unimplemented_sql/4itjrujemcztq) #severity/medium #urgency/low
  • miscellaneous column options
  • COMMENT ON (#ref/unimplemented_sql/e8uhkzz1krcgw): #severity/low #urgency/low
  • populate table with data at creation time #severity/low #urgency/low
  • READ COMMITTED: currently upgraded to REPEATABLE READ, could support directly (#ref/bm4q45m5ajz9s)
  • START TRANSACTION ISOLATION LEVEL SERIALIZABLE (#ref/unimplemented_sql/dy1ch15p7n3ze): maybe just a better error message: #severity/low #urgency/low
    • I think supporting this requires two things: TransactionOptions::set_snapshot(true), and upgrading every get to get_for_update
    • marking transactions as READ ONLY can then relax that for performance
  • cursors (see Postgres; not in standard interactive SQL) #severity/low #urgency/low
    • DECLARE (#ref/unimplemented_sql/159jmakdyuewy)
    • FETCH (#ref/unimplemented_sql/9nf8hju6ie9e1)
    • CLOSE (#ref/unimplemented_sql/gc1557zek4hne)
  • row-level security (see Postgres)
    • CREATE POLICY (#ref/unimplemented_sql/gce4sb6hptjfy)
    • DROP POLICY (#ref/unimplemented_sql/n9zgxwn6736wr)
    • ALTER POLICY (#ref/unimplemented_sql/wj1tnq4hkth7s)
  • DISCARD (#ref/unimplemented_sql/9a5zkcwgbhewh): (see Postgres): #severity/low #urgency/low
  • REPLACE INTO (MySQL syntax) or INSERT OR REPLACE (SQLite syntax) (#ref/unimplemented_sql/54xqh7ornzpso): #severity/low #urgency/low
  • CREATE OR REPLACE TABLE (#ref/unimplemented_sql/65so99qr1hyck): #severity/low #urgency/low
  • ASSERT (PowerSQL syntax) (#ref/unimplemented_sql/qq1s1fq45ztdw): #severity/low #urgency/low
  • CREATE TABLE .. LIKE (#ref/unimplemented_sql/xutp6m7rjnc54)
  • CREATE TABLE .. CLONE (#ref/unimplemented_sql/7b5kah43yqquc): optionally also copies data; likely RocksDB backend will not support that option
  • CREATE TABLE .. WITH (#ref/unimplemented_sql/3kozdfoeczmrk): see Postgres, likely parse superficially and error if given #severity/low #urgency/low
  • RENAME TABLE (MySQL syntax) (#ref/unimplemented_sql/85ax9gtrkcneg)
  • CREATE TABLE with column option CHARACTER SET (#ref/unimplemented_sql/ifi7dygtua3oc): parse superficially and refuse non-utf8 #severity/low #urgency/low
  • CREATE TABLE .. DEFAULT CHARSET (MySQL syntax) (#ref/unimplemented_sql/dcan9mz8g4eja): parse superficially and refuse non-utf8 #urgency/low #severity/low #size/small
  • relaxing constraints to end of transaction:
    • CREATE TABLE with table constraint PRIMARY KEY .. DEFERRABLE (#ref/unimplemented_sql/uzf4oabd178rr)
    • CREATE TABLE with table constraint PRIMARY KEY .. INITIALLY DEFERRED (#ref/unimplemented_sql/ysnkhzw6ukbms)
  • BEGIN {DEFERRED | IMMEDIATE | EXCLUSIVE} (SQLite syntax) (#ref/unimplemented_sql/o969ikdsjz76r): #severity/low #urgency/low
  • CREATE TABLE with column option ON CONFLICT (SQLite syntax) (#ref/unimplemented_sql/ehefbqgnnfqwe) #severity/medium #urgency/low
  • PRAGMA (SQLite syntax) (#ref/unimplemented_sql/a1k38hk48fpzr): #severity/low #urgency/low
  • SHOW CREATE (MySQL syntax) (#ref/unimplemented_sql/c7cewp15zm5ys)
  • SHOW STATUS (MySQL syntax) (#ref/unimplemented_sql/ejj6ir6dsxhrc)
  • SHOW DATABASES (MySQL syntax) (#ref/unimplemented_sql/kskgotui7rupc)
  • SHOW COLUMNS (MySQL syntax) (#ref/unimplemented_sql/zx98bjxguht5o)
  • SHOW VARIABLES (MySQL syntax) (#ref/unimplemented_sql/n5cs7ue5gzpde)
  • SHOW SCHEMAS (Snowflake syntax) (#ref/unimplemented_sql/poi5oxunjyq5a)
  • SHOW VIEWS (Snowflake syntax) (#ref/unimplemented_sql/zzi1wb3jq5jeh)
  • { EXPLAIN | DESCRIBE } TABLE (MySQL syntax) (#ref/unimplemented_sql/4tpmma7ewdf3a): #severity/low #urgency/low
  • CREATE TABLE .. AUTO_INCREMENT (MySQL syntax) (#ref/unimplemented_sql/18kxgbi8rzb5r): #severity/low #urgency/low
  • USE (#ref/unimplemented_sql/ng34e6kt6rrt4): #severity/low #urgency/low
  • CREATE TABLE with table constraint PRIMARY KEY .. COMMENT (MySQL syntax) (#ref/unimplemented_sql/ss9iqz1c8ffrh)
  • MySQL-style names on constraints or indexes:
    • CREATE TABLE with table constraint PRIMARY KEY given a name (MySQL syntax) (#ref/unimplemented_sql/5rr7ouyasn1dk)
    • CREATE TABLE with table constraint PRIMARY KEY given an index name (MySQL syntax) (#ref/unimplemented_sql/mexuaio6ngad4)
  • CREATE TABLE with table constraint UNIQUE {INDEX | KEY} (MySQL syntax) (#ref/unimplemented_sql/m3op6enhb5pgw)
  • CREATE TABLE with table constraint PRIMARY KEY .. NOT ENFORCED (MySQL syntax) (#ref/unimplemented_sql/u339u8jzzkqdy)
  • CREATE TABLE with column option ON UPDATE timestamp function (MySQL syntax) (#ref/unimplemented_sql/zaub41ukgqfmq) #severity/low #urgency/low
  • CREATE TABLE with column option MATERIALIZED (ClickHouse syntax) (#ref/unimplemented_sql/ouq6dfrgbxn9w) #severity/low #urgency/low
  • CREATE TABLE with column option EPHEMERAL (ClickHouse syntax) (#ref/unimplemented_sql/78qipewmqn31h) #severity/low #urgency/low
  • CREATE TABLE with column option ALIAS (ClickHouse syntax) (#ref/unimplemented_sql/1adkef6yd1qgc) #severity/low #urgency/low
  • CREATE TABLE with column option OPTIONS (BigQuery syntax) (#ref/unimplemented_sql/7drpfhozbre1k) #severity/low #urgency/low
  • CREATE TABLE with column option TAG (Snowflake syntax) (#ref/unimplemented_sql/sjpe4f636kn8n) #severity/low #urgency/low
  • CREATE TABLE with column option { MASKING POLICY | PROJECTION POLICY } (Snowflake syntax) (#ref/unimplemented_sql/ynucdy5u7a35w) #severity/low #urgency/low
  • CREATE TABLE with weird column option (#ref/unimplemented_sql/4jqhus4cj4fho) #severity/low #urgency/low

There are plenty of corners of SQL syntax we have chosen not to support. These are marked in the source code as NotSupportedSql. We may have judged wrong; let us know if you need some unsupported syntax.

Later

Features and improvements we would like to work on, but that aren't started yet.

Maybe

Possible features we have not at this time committed to implementing, or that are so far out that we just don't have a clear picture of how to implement them.

Small cleanups

  • exact_zip helper: we use Iterator::zip in many places where having non-equal lengths would be a bug, systematize
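A minimal sketch of what such an exact_zip helper could look like (the name comes from the bullet above; the exact signature is an assumption):

```rust
/// Like `Iterator::zip`, but panics on a length mismatch instead of
/// silently truncating to the shorter input.
fn exact_zip<IA, IB>(a: IA, b: IB) -> impl Iterator<Item = (IA::Item, IB::Item)>
where
    IA: ExactSizeIterator,
    IB: ExactSizeIterator,
{
    // The lengths are known up front, so the mismatch is caught before
    // any items are consumed.
    assert_eq!(a.len(), b.len(), "exact_zip: length mismatch");
    a.zip(b)
}
```

Requiring ExactSizeIterator keeps the check O(1); call sites that only have plain iterators would need to collect first or use a counting variant.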

Ecosystem wishlist

DataFusion

Specific bugs and missing features:

Architectural:

sqlparser-rs

sqllogictest-rs

clippy (?)

tracing

Rust

misc


Details

RocksDB bind to C++ directly #maybe

RocksDB is a C++ library we're currently using through a less-featureful C shim. This crate allows calling C++ directly: https://github.com/dtolnay/cxx/. What are the downsides?

See also https://github.com/cozodb/cozo/tree/main/cozorocks, https://crates.io/crates/autocxx.

Compare to https://github.com/hkalbasi/zngur, https://crates.io/crates/cpp.

Prior work at https://github.com/google/autocxx/issues/1073. autocxx has been stagnant since 2022, so I'm pessimistic on this approach now.

MSRV policy

Streaming remote backup

Streaming remote backup for RocksDB backend

Coding conventions

Prefer lints over documenting things here. Anything that needs to be stated explicitly below is something we do not have a lint for.

C pointers

  • Name the struct field raw (or raw_* for multiple).
  • Use std::ptr::NonNull wherever possible.
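A sketch of the convention, with a hypothetical handle type (DbHandle and from_c are illustrative names, not KantoDB APIs):

```rust
use std::ffi::c_void;
use std::ptr::NonNull;

// The C handle lives in a field named `raw`, stored as NonNull so
// nullness is ruled out at construction time.
struct DbHandle {
    raw: NonNull<c_void>,
}

impl DbHandle {
    // One null check at the FFI boundary; after this, `raw` is never null.
    fn from_c(ptr: *mut c_void) -> Option<DbHandle> {
        NonNull::new(ptr).map(|raw| DbHandle { raw })
    }

    fn as_ptr(&self) -> *mut c_void {
        self.raw.as_ptr()
    }
}
```

NonNull also enables niche optimization, so Option<DbHandle> is pointer-sized.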

Keepalives

  • Name the struct field _keepalive_*, or if it needs to be assigned to after creation keepalive_*.
  • Put keepalives at the end of the struct; struct fields are dropped in declaration order.
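The drop-order rule can be demonstrated with a small self-contained example (Tracked, Handle, and drop_order are illustrative, not KantoDB code):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Records its label into a shared log when dropped.
struct Tracked(Rc<RefCell<Vec<&'static str>>>, &'static str);
impl Drop for Tracked {
    fn drop(&mut self) {
        self.0.borrow_mut().push(self.1);
    }
}

// Struct fields drop in declaration order, so the keepalive declared
// last outlives everything that depends on it.
struct Handle {
    inner: Tracked,          // dropped first
    _keepalive_env: Tracked, // dropped last
}

fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    drop(Handle {
        inner: Tracked(Rc::clone(&log), "inner"),
        _keepalive_env: Tracked(Rc::clone(&log), "keepalive"),
    });
    Rc::try_unwrap(log).unwrap().into_inner()
}
```

drop_order() returns the labels in the order the fields were dropped, confirming the keepalive goes last.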

Reference count keepalives

  • If A depends on B (for example, rocksdb_a_destroy must be called before rocksdb_b_destroy), and doesn't naturally refer to B, add a struct field _keepalive_b: B (or Arc<B>, if the reference count is not hidden as an inner field).

C pointer keepalives

  • If you are passing Rust-allocated memory to C, where C holds on to that memory past the function call, use std::pin::Pin to ensure Rust does not move the data from underneath.
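A sketch of the pattern (CRegisteredBuf is a hypothetical name; note that Box already gives a stable heap address, so Pin here mainly documents and enforces the "must not move" intent in the type):

```rust
use std::pin::Pin;

// A Rust-allocated buffer whose address C may hold past the call.
struct CRegisteredBuf {
    buf: Pin<Box<[u8]>>,
}

impl CRegisteredBuf {
    fn new(data: &[u8]) -> CRegisteredBuf {
        CRegisteredBuf {
            buf: Pin::new(data.to_vec().into_boxed_slice()),
        }
    }

    // The pointer handed to C; it stays valid while `self` is alive,
    // even if the CRegisteredBuf value itself is moved.
    fn as_ptr(&self) -> *const u8 {
        self.buf.as_ptr()
    }
}
```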

No lint leftovers

(Needed to state justification for manual #[allow]s.)

Use #[expect(...)] instead of #[allow(...)] to allow lints.

Acceptable exceptions:

  • macros or other generated code that only sometimes triggers the lint rule

For lints that trigger only with some architectures/platforms/configurations, express that with cfg.
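An illustrative example of the difference (scan_stub is a made-up function): unlike #[allow], #[expect] itself triggers the unfulfilled_lint_expectations lint once the suppressed lint no longer fires, so stale suppressions surface instead of lingering.

```rust
#[expect(unused_variables, reason = "illustration of the convention")]
fn scan_stub() -> u32 {
    let planned_row_count = 0u32; // triggers unused_variables, as expected
    7
}
```

For the platform-specific case, the suppression can be gated, e.g. #[cfg_attr(target_pointer_width = "32", expect(...))], so 64-bit builds do not carry an unfulfilled expectation.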

Indent don't align

Trying to align "columns" in source code where the amount of content in the "left column" varies leads to unnecessary code churn; don't do it.

Examples:

  • aligning # or // in inline comments
  • aligning = in assignments

tx means database transactions, not transmit

The identifier tx shall refer to database transactions, or lower-level transactions such as RocksDB transaction used to implement a database transaction.

For consistency, avoid txn.

Ends of channels are called "sender" and "receiver", not tx/rx.

Rust value builders have .build() methods

For consistency, avoid .finish(). Transactions are finished.

Inner error fields are named error not source

Prefer Box<[T]> over Vec when the length does not change

Also collect directly into Box<[T]>, not to Vec<T> only to call Vec::into_boxed_slice.
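For example (square_table is an illustrative function, not KantoDB code), Box<[T]> implements FromIterator, so no intermediate Vec is spelled out:

```rust
// Collect straight into Box<[T]> when the length is fixed after creation.
fn square_table(n: u32) -> Box<[u64]> {
    (0..u64::from(n)).map(|i| i * i).collect()
}
```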

Check vs validate vs assert

To validate user input, parse don't validate: fn parse(input: &[u8]) -> Result<..., KantoError>.

Check non-parsing related operational preconditions, where failing the check is within normal operation and depends on user input. Think 42.checked_add(13). Typically fn check_foo(...) -> Result<(), KantoError>.

Validate invariants and such that are always expected to be true, and failing means corruption or internal error. Typically fn validate_foo(...) -> Result<(), KantoError>.

Assert additional, potentially costly, checks that are only done in debug builds. Typically fn assert_foo(...) with no return value.
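The four conventions above can be sketched as follows (all names and the stand-in KantoError are illustrative, not the real types):

```rust
use std::collections::HashSet;

// Stand-in for the real error type; for illustration only.
#[derive(Debug, PartialEq)]
struct KantoError(&'static str);

// Parse, don't validate: turn raw user input into a typed value.
fn parse_page_size(input: &str) -> Result<u32, KantoError> {
    input.parse().map_err(|_| KantoError("invalid page size"))
}

// Check an operational precondition; failing here is normal operation.
fn check_capacity(len: usize, max: usize) -> Result<(), KantoError> {
    if len <= max {
        Ok(())
    } else {
        Err(KantoError("over capacity"))
    }
}

// Validate an invariant; failing means corruption or an internal error.
fn validate_sorted(keys: &[u32]) -> Result<(), KantoError> {
    if keys.windows(2).all(|w| w[0] <= w[1]) {
        Ok(())
    } else {
        Err(KantoError("corrupt index: keys out of order"))
    }
}

// Assert potentially costly extra checks, debug builds only.
fn assert_unique(keys: &[u32]) {
    debug_assert_eq!(keys.iter().collect::<HashSet<_>>().len(), keys.len());
}
```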

It is less painful to type tracing::debug! than to deal with import churn

Sometimes being explicit and verbose is actually the shorter and simpler way.

Manifesto

(For things we do have a lint for.)

Panic policy

The codebase still has undesirable use of .unwrap() or .expect(). We are working to limit their use.

KantoDB should not panic regardless of library API misuse, corrupt or malicious database files, or user input (SQL queries, network protocol). Please report it on the community forum if you see an unexpected panic.

  • using .unwrap() can be ok during development, but typically not in final code
  • can panic on out of memory; for now, this is the easy path, waiting for Rust Allocator changes and ecosystem to catch up
  • ideally, never panic on corrupt data, but this is tough to enforce if panic on internal errors is allowed; fuzzing is only probabilistic
  • debug mode only invariant checking can panic (use debug_assert!); this should be used only where it is impossible to trigger by file/network input
  • open question: should we panic on internal errors where we "know it will be Some" but can't express that through APIs; this should be used only where it is impossible to trigger by file/network input
    • the current plan is to unify on the .expect("internal error XYZ: ...") style, add a lint enforcing that those are the only expects, and then let them stay in the code as long as a fuzzer doesn't hit them
  • never panic on I/O error; however, default policy might change to controlled stop & recommend reboot
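A sketch of the planned expect style described above ("XYZ" stands in for a unique code, and widest_column is an illustrative function, not KantoDB code):

```rust
// Known-infallible internal cases get a uniquely tagged .expect() that a
// lint can whitelist and a fuzzer can hunt for.
fn widest_column(widths: &[usize]) -> usize {
    let widest = widths
        .iter()
        .copied()
        .max()
        .expect("internal error XYZ: schema guarantees at least one column");
    // Debug-only invariant re-check; must not be reachable from file or
    // network input.
    debug_assert!(widths.iter().all(|&w| w <= widest));
    widest
}
```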

Resources:

Decisions

Not using TableProvider.insert_into

That's a fine convenience, but since there's nothing comparable for UPDATE or DELETE, we're going to need the less convenient code path anyway.

Links:

Developer glossary

Composite key

A key consisting of multiple values.

Primary key

Primary key identifies a row for tables that have PRIMARY KEY specified in the schema. It is a Composite key.

Row key

Whatever identifies a row, whether a Row ID or Primary key.

Row ID

Row ID identifies a row for tables that do not have PRIMARY KEY specified in the schema. It is a numeric sequential identifier automatically assigned for rows as needed.

Resources

Unorganized