Crate engine_traits

A generic TiKV storage engine

This is a work-in-progress attempt to abstract all the features needed by TiKV to persist its data, so that storage engines other than RocksDB may be added to TiKV in the future.

This crate must not have any transitive dependencies on RocksDB. The RocksDB implementation is in the engine_rocks crate.

In addition to documenting the API, this documentation contains a description of the porting process, current design decisions and design guidelines, and refactoring tips.

Capabilities of a TiKV engine

TiKV engines store binary keys and values.

Every key-value pair lives in a column family, which can be thought of as an independent data store.

Consistent read-only views of the database are accessed through snapshots.

Multiple writes can be committed atomically with a write batch.
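The four capabilities above can be illustrated with a toy in-memory store. This is a minimal sketch, not the engine_traits API: `MemEngine`, `MemSnapshot`, `MemWriteBatch`, and all of their methods are hypothetical names invented for this example, and copying the whole map on each write is only a simple way to make batches appear atomic.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

// Hypothetical toy types for illustration only; not the engine_traits API.
// Keys and values are plain bytes.
type Cf = BTreeMap<Vec<u8>, Vec<u8>>;

#[derive(Default, Clone)]
struct MemEngine {
    // One independent ordered map per column family name.
    cfs: Arc<BTreeMap<String, Cf>>,
}

/// A consistent read-only view: it keeps the data as it was when taken.
struct MemSnapshot {
    cfs: Arc<BTreeMap<String, Cf>>,
}

/// Buffered writes that are applied together.
#[derive(Default)]
struct MemWriteBatch {
    puts: Vec<(String, Vec<u8>, Vec<u8>)>,
}

impl MemWriteBatch {
    fn put_cf(&mut self, cf: &str, key: &[u8], value: &[u8]) {
        self.puts.push((cf.to_string(), key.to_vec(), value.to_vec()));
    }
}

impl MemEngine {
    fn snapshot(&self) -> MemSnapshot {
        // Sharing the Arc is cheap; later writes replace the engine's Arc
        // and never mutate the map the snapshot points to.
        MemSnapshot { cfs: Arc::clone(&self.cfs) }
    }

    fn get_cf(&self, cf: &str, key: &[u8]) -> Option<Vec<u8>> {
        self.cfs.get(cf)?.get(key).cloned()
    }

    /// Apply every write in the batch, then publish the result in one step.
    fn write(&mut self, batch: MemWriteBatch) {
        let mut next = (*self.cfs).clone();
        for (cf, key, value) in batch.puts {
            next.entry(cf).or_default().insert(key, value);
        }
        self.cfs = Arc::new(next);
    }
}

impl MemSnapshot {
    fn get_cf(&self, cf: &str, key: &[u8]) -> Option<Vec<u8>> {
        self.cfs.get(cf)?.get(key).cloned()
    }
}
```

A snapshot taken before a later `write` keeps returning the older value, while reads on the engine itself observe the new one.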

The TiKV engine API

The API inherits its design from RocksDB. As support for other engines is added to TiKV, it is expected that this API will become more abstract, and less Rocks-specific.

This crate is almost entirely traits, plus a few “plain-old-data” types that are shared between engines.

Some key types include:

The KvEngine instance generally acts as a factory for types that implement other traits in the crate. These factory methods, associated types, and other associated methods are defined in “extension” traits. For example, methods on engines related to batch writes are in the WriteBatchExt trait.
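The factory-plus-extension-trait arrangement can be sketched roughly as follows. The trait names mirror those mentioned above, but the method signatures are simplified assumptions made for this example, and `ToyEngine`, `ToyWriteBatch`, and `stage_two_puts` are hypothetical; consult the trait pages for the real definitions.

```rust
// Simplified, assumed shapes; the real traits carry more methods and bounds.
trait WriteBatch {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn count(&self) -> usize;
}

/// Extension trait: the associated type and its factory method live here
/// rather than on the engine trait itself.
trait WriteBatchExt {
    type WriteBatch: WriteBatch;
    fn write_batch(&self) -> Self::WriteBatch;
}

/// The engine trait pulls the extension trait in as a supertrait.
trait KvEngine: WriteBatchExt {}

// Hypothetical toy implementation.
struct ToyEngine;

struct ToyWriteBatch {
    entries: Vec<(Vec<u8>, Vec<u8>)>,
}

impl WriteBatch for ToyWriteBatch {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.entries.push((key.to_vec(), value.to_vec()));
    }

    fn count(&self) -> usize {
        self.entries.len()
    }
}

impl WriteBatchExt for ToyEngine {
    type WriteBatch = ToyWriteBatch;

    fn write_batch(&self) -> ToyWriteBatch {
        ToyWriteBatch { entries: Vec::new() }
    }
}

impl KvEngine for ToyEngine {}

/// Generic code asks the engine for a batch without naming its concrete type.
fn stage_two_puts<E: KvEngine>(engine: &E) -> usize {
    let mut wb = engine.write_batch();
    wb.put(b"a", b"1");
    wb.put(b"b", b"2");
    wb.count()
}
```

Because the batch type is an associated type, each engine chooses its own batch implementation while callers stay generic over `E`.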

Design notes

The porting process

These are some guidelines that seem to make the porting manageable. As the process continues, new strategies are discovered and written here. This is a big refactoring and will take many months.

Refactoring is a craft, not a science, and figuring out how to overcome any particular situation takes experience and intuition, but these principles can help.

A guiding principle is to do one thing at a time. In particular, don’t redesign while encapsulating.

The port is happening in stages:

  1. Migrating the engine abstractions
  2. Eliminating direct-use of rocksdb re-exports
  3. “Pulling up” the generic abstractions through TiKV
  4. Isolating test cases from RocksDB

These stages are described in more detail:

1) Migrating the engine abstractions

The engine crate was an earlier attempt to abstract the storage engine. Much of its structure is duplicated near-identically in engine_traits, the difference being that engine_traits has no RocksDB dependencies. Having no RocksDB dependencies makes it trivial to guarantee that the abstractions are truly abstract.

engine also reexports raw bindings from rust-rocksdb for every purpose for which there is not yet an abstract trait.

During this stage, we will eliminate the wrappers from engine to reduce code duplication. We do this by identifying a small subsystem within engine, duplicating it within engine_traits and engine_rocks, deleting the code from engine, and fixing all the callers to work with the abstracted implementation.

At the end of this stage the engine dependency will contain no code except for rust-rocksdb reexports. TiKV will still depend on the concrete RocksDB implementations from engine_rocks, as well as the raw APIs reexported from the rust-rocksdb crate.

2) Eliminating the engine dep from TiKV with new abstractions

TiKV uses reexported rust-rocksdb APIs via the engine crate. During this stage we need to identify each of these APIs, duplicate them generically in the engine_traits and engine_rocks crates, and convert all callers to use the engine_rocks crate instead.

At the end of this phase the engine crate will be deleted.

3) “Pulling up” the generic abstractions through TiKV

With all of TiKV using the engine_traits traits in conjunction with the concrete engine_rocks types, we can push generic type parameters up through the application. Then we will remove the concrete engine_rocks dependency from TiKV so that it is impossible to re-introduce engine-specific code again.

We will probably introduce some other crate to mediate between multiple engine implementations, such that at the end of this phase TiKV will not have a dependency on engine_rocks.

It will, though, still have a dev-dependency on engine_rocks for the test cases.
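As a rough illustration of what “pulling up” a type parameter means, the sketch below makes a component generic over its engine instead of naming a concrete engine type. Everything here is hypothetical: the one-method `KvEngine` trait is a stand-in for the real, much larger trait, and `MapEngine`, `EmptyEngine`, and `Storage` exist only for this example.

```rust
use std::collections::HashMap;

// Hypothetical one-method stand-in for the real KvEngine trait.
trait KvEngine {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}

// Two stand-in engines; neither is a real TiKV engine.
struct MapEngine(HashMap<Vec<u8>, Vec<u8>>);
struct EmptyEngine;

impl KvEngine for MapEngine {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
}

impl KvEngine for EmptyEngine {
    fn get(&self, _key: &[u8]) -> Option<Vec<u8>> {
        None
    }
}

// The engine is a type parameter, so this component never names a
// concrete engine; the choice is made by whoever constructs it.
struct Storage<E: KvEngine> {
    engine: E,
}

impl<E: KvEngine> Storage<E> {
    fn new(engine: E) -> Self {
        Storage { engine }
    }

    /// Length of the stored value, or 0 if the key is absent.
    fn value_len(&self, key: &[u8]) -> usize {
        self.engine.get(key).map_or(0, |v| v.len())
    }
}
```

Once every component takes `E` this way, only the top of the application needs to pick a concrete engine, which is what allows the direct dependency on a specific engine crate to be dropped.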

4) Isolating test cases from RocksDB

Eventually we need our test suite to run over multiple engines. The exact strategy here is yet to be determined, but it may begin by breaking the engine_rocks dependency with a new engine_test crate that begins by simply wrapping engine_rocks.
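One possible shape for such a wrapper, offered only as a sketch since the strategy is undecided, is a thin layer of aliases and constructors that tests use instead of the concrete engine. `RocksEngineStub`, `TestEngine`, `new_test_engine`, and `engine_under_test` are all hypothetical names; the stub stands in for the concrete type from engine_rocks.

```rust
// Hypothetical stand-in for the concrete engine type from engine_rocks.
struct RocksEngineStub;

impl RocksEngineStub {
    fn engine_name(&self) -> &'static str {
        "rocks-stub"
    }
}

// What a thin engine_test layer could export: tests refer only to the
// alias and the constructor, never to the concrete engine directly.
type TestEngine = RocksEngineStub;

fn new_test_engine() -> TestEngine {
    RocksEngineStub
}

// A test helper written against the alias; pointing the alias at a
// different engine retargets every helper like this one.
fn engine_under_test() -> &'static str {
    new_test_engine().engine_name()
}
```

With this arrangement, running the suite over another engine would mean changing the alias and constructor in one place rather than editing each test.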

Refactoring tips

Re-exports

pub use crate::range::*;
pub use compaction_job::*;

Modules

cf_defs
cf_names
cf_options
compact

Functionality related to compaction

compaction_job
config
db_options
db_vector
encryption
engine
engines
errors
file_system
import
iterable

Iteration over engines and snapshots.

misc

This trait contains miscellaneous features that have not been carefully factored into other traits.

mutable
mvcc_properties
options
peekable
perf_context
properties
raft_engine
range
range_properties

Various metrics related to key ranges

snapshot
sst
sst_partitioner
table_properties
ttl_properties
util
write_batch

Structs

CacheStats
EngineFileSystemInspector
Engines
FileEncryptionInfo
IndexHandle
IndexHandles
IterOptions
MvccProperties
ReadOptions
SSTMetaInfo
SstPartitionerContext
SstPartitionerRequest
TtlProperties
WriteOptions

Enums

DeleteStrategy
EncryptionMethod
Error
PerfContextKind

The raftstore subsystem the PerfContext is being created for.

PerfLevel
SeekKey

A token indicating where an iterator “seek” operation should stop.

SeekMode
SstCompressionType
SstPartitionerResult

Constants

ALL_CFS
CF_DEFAULT
CF_LOCK
CF_RAFT
CF_WRITE
DATA_CFS
DATA_KEY_PREFIX_LEN
LARGE_CFS

Traits

CFNamesExt
CFOptionsExt

Trait for engines with column family options

ColumnFamilyOptions
CompactExt
CompactedEvent
DBOptions

A handle to a database’s options

DBOptionsExt

A trait for engines that support setting global options

DBVector

A type that holds buffers queried from the database.

DecodeProperties
EncryptionKeyManager
ExternalSstFileInfo
FileSystemInspector
ImportExt
IngestExternalFileOptions
Iterable
Iterator

An iterator over a consistent set of keys and values.

KvEngine

A TiKV key-value store

MiscExt
Mutable

A trait implemented by WriteBatch

MvccPropertiesExt
Peekable

Types from which values can be read.

PerfContext

Reports metrics to prometheus

PerfContextExt

Extensions for measuring engine performance.

RaftEngine
RaftLogBatch
RangePropertiesExt
Snapshot

A consistent read-only view of the database.

SstExt
SstPartitioner
SstPartitionerFactory
SstReader

SstReader is used to read an SST file.

SstWriter

SstWriter is used to create SST files that can be added to the database later.

SstWriterBuilder

A builder that builds an SstWriter.

SyncMutable
TableProperties
TablePropertiesCollection
TablePropertiesCollectionIter
TablePropertiesExt
TablePropertiesKey
TitanDBOptions

Titan-specific options

TtlPropertiesExt
UserCollectedProperties
WriteBatch

Batches of multiple writes that are committed atomically

WriteBatchExt

Engines that can create write batches

Functions

collect

Collect all items of it into a vector, generally used for tests.

name_to_cf

Type Definitions

CfName
Result