Refine
Document Type
- Conference proceeding (10)
- Book chapter (2)
Has full text
- yes (12)
Is part of the Bibliography
- yes (12)
Institute
- Informatik (12)
Publisher
- Association for Computing Machinery (4)
- Springer (4)
- OpenProceedings (2)
- IEEE (1)
- Universität Konstanz (1)
Database management systems and K/V-Stores operate on updatable datasets – massively exceeding the size of available main memory. Tree-based K/V storage management structures became particularly popular in storage engines. B+ -Trees [1, 4] allow constant search performance, however write-heavy workloads yield in inefficient write patterns to secondary storage devices and poor performance characteristics. LSM-Trees [16, 23] overcome this issue by horizontal partitioning fractions of data – small enough to fully reside in main memory, but require frequent maintenance to sustain search performance.
Firstly, we propose Multi-Version Partitioned BTrees (MV-PBT) as sole storage and index management structure in key-sorted storage engines like K/V-Stores. Secondly, we compare MV-PBT against LSM-Trees. The logical horizontal partitioning in MV-PBT allows leveraging recent advances in modern B+ -Tree techniques in a small transparent and memory resident portion of the structure. Structural properties sustain steady read performance, yielding efficient write patterns and reducing write amplification.
We integrated MV-PBT in the WiredTiger [15] KV storage engine. MV-PBT offers an up to 2× increased steady throughput in comparison to LSM-Trees and several orders of magnitude in comparison to B+ -Trees in a YCSB [5] workload.
We introduce bloomRF as a unified method for approximate membership testing that supports both point- and range-queries. As a first core idea, bloomRF introduces novel prefix hashing to efficiently encode range information in the hash-code of the key itself. As a second key concept, bloomRF proposes novel piecewisemonotone hash-functions that preserve local order and support fast range-lookups with fewer memory accesses. bloomRF has near-optimal space complexity and constant query complexity. Although, bloomRF is designed for integer domains, it supports floating-points, and can serve as a multi-attribute filter. The evaluation in RocksDB and in a standalone library shows that it is more efficient and outperforms existing point-range-filters by up to 4× across a range of settings and distributions, while keeping the false-positive rate low.