Informatik
Refine
Document Type
Has full text
- yes (4) (remove)
Is part of the Bibliography
- yes (4)
Institute
- Informatik (4)
Publisher
- OpenProceedings (4) (remove)
We introduce bloomRF as a unified method for approximate membership testing that supports both point- and range-queries. As a first core idea, bloomRF introduces novel prefix hashing to efficiently encode range information in the hash-code of the key itself. As a second key concept, bloomRF proposes novel piecewisemonotone hash-functions that preserve local order and support fast range-lookups with fewer memory accesses. bloomRF has near-optimal space complexity and constant query complexity. Although, bloomRF is designed for integer domains, it supports floating-points, and can serve as a multi-attribute filter. The evaluation in RocksDB and in a standalone library shows that it is more efficient and outperforms existing point-range-filters by up to 4× across a range of settings and distributions, while keeping the false-positive rate low.
Even though near-data processing (NDP) can provably reduce data transfers and increase performance, current NDP is solely utilized in read-only settings. Slow or tedious to implement synchronization and invalidation mechanisms between host and smart storage make NDP support for data-intensive update operations difficult. In this paper, we introduce a low-latency cache-coherent shared lock table for update NDP settings in disaggregated memory environments. It utilizes the novel CCIX interconnect technology and is integrated in neoDBMS, a near-data processing DBMS for smart storage. Our evaluation indicates end-to-end lock latencies of ∼80-100ns and robust performance under contention.
Modern mixed (HTAP)workloads execute fast update-transactions and long running analytical queries on the same dataset and system. In multi-version (MVCC) systems, such workloads result in many short-lived versions and long version-chains as well as in increased and frequent maintenance overhead.
Consequently, the index pressure increases significantly. Firstly, the frequent modifications cause frequent creation of new versions, yielding a surge in index maintenance overhead. Secondly and more importantly, index-scans incur extra I/O overhead to determine, which of the resulting tuple versions are visible to the executing transaction (visibility-check) as current designs only store version/timestamp information in the base table – not in the index. Such index-only visibility-check is critical for HTAP workloads on large datasets.
In this paper we propose the Multi Version Partitioned B-Tree (MV-PBT) as a version-aware index structure, supporting index-only visibility checks and flash-friendly I/O patterns. The experimental evaluation indicates a 2x improvement for analytical queries and 15% higher transactional throughput under HTAP workloads. MV-PBT offers 40% higher tx. throughput compared to WiredTiger’s LSM-Tree implementation under YCSB.
In this paper we present our work in progress on revisiting traditional DBMS mechanisms to manage space on native Flash and how it is administered by the DBA. Our observations and initial results show that: the standard logical database structures can be used for physical organization of data on native Flash; at the same time higher DBMS performance is achieved without incurring extra DBA overhead. Initial experimental evaluation indicates a 20% increase in transactional throughput under TPC-C, by performing intelligent data placement on Flash, less erase operations and thus better Flash longevity.