Coding conventions

Prefer lints over documenting things here. What needs to be explicitly said, we do not have a lint for.

C pointers

  • Name the struct field raw (or raw_* for multiple).
  • Use std::ptr::NonNull wherever possible.

Keepalives

  • Name the struct field _keepalive_*, or if it needs to be assigned to after creation keepalive_*.
  • Put keepalives at the end of the struct; struct fields are dropped in declaration order.

Reference count keepalives

  • If A depends on B (for example, rocksdb_a_destroy must be called before rocksdb_b_destroy), and doesn't naturally refer to B, add a struct field _keepalive_b: B (or Arc<B>, if the reference count is not hidden as an inner field).

C pointer keepalives

  • If you are passing Rust-allocated memory to C, where C holds on to that memory past the function call, use std::pin::Pin to ensure Rust does not move the data from underneath.

No lint leftovers

(Needed to state justification for manual #[allow]s.)

Use #[expect(...)] instead of #[allow(...)] to allow lints.

Acceptable exceptions:

  • macros or other generated code that only sometimes triggers the lint rule

For lints that trigger only with some architectures/platforms/configurations, express that with cfg.

Indent don't align

Trying to align "columns" in source code where the amount of content in the "left column" varies leads to unnecessary code churn; don't do it.

Examples:

  • aligning # or // in inline comments
  • aligning = in assignments

tx means database transactions, not transmit

The identifier tx shall refer to database transactions, or lower-level transactions such as RocksDB transaction used to implement a database transaction.

For consistency, avoid txn.

Ends of channels are called "sender" and "receiver", not tx/rx.

Rust value builders have .build() methods

For consistency, avoid .finish(). Transactions are finished.

Inner error fields are named error not source

Prefer Box<[T]> over Vec when the length does not change

Also collect directly into Box<[T]>, not to Vec<T> only to call Vec::into_boxed_slice.

Check vs validate vs assert

To validate user input, parse don't validate: fn parse(input: &[u8]) -> Result<..., KantoError>.

Check non-parsing related operational preconditions, where failing the check is within normal operation and depends on user input. Think 42.checked_add(13). Typically fn check_foo(...) -> Result<(), KantoError>.

Validate invariants and such that are always expected to be true, and failing means corruption or internal error. Typically fn validate_foo(...) -> Result<(), KantoError>.

Assert additional, potentially costly, checks that are only done in debug builds. Typically fn assert_foo(...) with no return value.

It is less painful to type tracing::debug! than to deal with import churn

Sometimes being explicit and verbose is actually the shorter and simpler way.

Manifesto

(For things we do have a lint for.)

Panic policy

The codebase still has undesirable use of .unwrap() or .expect(). We are working to limit their use.

KantoDB should not panic regardless of library API misuse, corrupt or malicious database files, or user input (SQL queries, network protocol). Please report it on the community forum if you see an unexpected panic.

  • using .unwrap() can be ok during development, but typically not in final code
  • can panic on out of memory; for now, this is the easy path, waiting for Rust Allocator changes and ecosystem to catch up
  • ideally, never panic on corrupt data, but this is tough to enforce if panic on internal errors is allowed; fuzzing is only probabilistic
  • debug mode only invariant checking can panic (use debug_assert!); this should be used only where it is impossible to trigger by file/network input
  • open question: should we panic on internal errors where we "know it will be Some" but can't express that through APIs; this should be used only where it is impossible to trigger by file/network input
    • current plan is to unify to .expect("internal error XYZ: ...") style, add lint that those are the only expects, and then let them stay in the code as long as a fuzzer doesn't hit them
  • never panic on I/O error; however, default policy might change to controlled stop & recommend reboot

Resources: