OPUS 4 | Informatik

Introduction to the special issue on self‑managing and hardware‑optimized database systems 2022 (2023)

Data management systems have evolved in terms of functionality, performance characteristics, complexity, and variety during the last 40 years. Particularly, the relational database management systems and the big data systems (e.g., Key-Value stores, Document stores, Graph stores and Graph Computation Systems, Spark, MapReduce/Hadoop, or Data Stream Processing Systems) have evolved with novel additions and extensions. However, the systems administration and tasks have become highly complex and expensive, especially given the simultaneous and rapid hardware evolution in processors, memory, storage, or networking. These developments present new open problems and challenges to data management systems as well as new opportunities. The SMDB (International Workshop on Self-Managing Database Systems) and HardBD&Active (Joint International Workshop on Big Data Management on Emerging Hardware and Data Management on Virtualized Active Systems) workshops organized in conjunction with the IEEE ICDE (International Conference on Data Engineering) offered two distinct platforms for examining the above system-related challenges from different perspectives. The SMDB workshop looks into developing autonomic or self-* features in database and data management systems to tackle complex administrative tasks, while the HardBD&Active workshop focuses on harnessing hardware technologies to enhance efficiency and performance of data processing and management tasks. As a result of these workshops, we are delighted to present the third special issue of DAPD titled “Self-Managing and Hardware-Optimized Database Systems 2022,” which showcases the best contributions from the SMDB 2021/2022 and HardBD&Active 2021/2022 workshops.

pimDB: From main-memory DBMS to processing-in-memory DBMS-engines on intelligent memories (2023)

Bernhardt, Arthur ; Koch, Andreas ; Petrov, Ilia

The performance and scalability of modern data-intensive systems are limited by massive data movement of growing datasets across the whole memory hierarchy to the CPUs. Such traditional processor-centric DBMS architectures are bandwidth- and latency-bound. Processing-in-Memory (PIM) designs seek to overcome these limitations by integrating memory and processing functionality on the same chip. PIM targets near- or in-memory data processing, leveraging the greater in-situ parallelism and bandwidth. In this paper, we introduce pimDB and provide an initial comparison of processor-centric and PIM-DBMS approaches under different aspects, such as scalability and parallelism, cache-awareness, or PIM-specific compute/bandwidth tradeoffs. The evaluation is performed end-to-end on a real PIM hardware system from UPMEM.

NVMulator: a configurable open-source non-volatile memory emulator for FPGAs (2023)

Tamimi, Sajjad ; Bernhardt, Arthur ; Stock, Florian ; Petrov, Ilia ; Koch, Andreas

Near-Data Processing (NDP) is a key computing paradigm for reducing the ever growing time and energy costs of data transport versus computations. With their flexibility, FPGAs are an especially suitable compute element for NDP scenarios. Even more promising is the exploitation of novel and future non-volatile memory (NVM) technologies for NDP, which aim to achieve DRAM-like latencies and throughputs, while providing large capacity non-volatile storage. Experimentation in using FPGAs in such NVM-NDP scenarios has been hindered, though, by the fact that the NVM devices/FPGA boards are still very rare and/or expensive. It thus becomes useful to emulate the access characteristics of current and future NVMs using off-the-shelf DRAMs. If such emulation is sufficiently accurate, the resulting FPGA-based NDP computing elements can be used for actual full-stack hardware/software benchmarking, e.g., when employed to accelerate a database. For this use, we present NVMulator, an open-source easy-to-use hardware emulation module that can be seamlessly inserted between the NDP processing elements on the FPGA and a conventional DRAM-based memory system. We demonstrate that, with suitable parametrization, the emulated NVM can come very close to the performance characteristics of actual NVM technologies, specifically Intel Optane. We achieve 0.62% and 1.7% accuracy for cache line sized accesses for read and write operations, while utilizing only 0.54% of LUT logic resources on a Xilinx/AMD AU280 UltraScale+ FPGA board. We consider both file-system as well as database access patterns, examining the operation of the RocksDB database when running on real or emulated Optane-technology memories.

neoDBMS: in-situ snapshots for multi-version DBMS on native computational storage (2022)

Bernhardt, Arthur ; Tamimi, Sajjad ; Vinçon, Tobias ; Knödler, Christian ; Stock, Florian ; Heniz, Carsten ; Koch, Andreas ; Petrov, Ilia

Multi-versioning and MVCC are the foundations of many modern DBMSs. Under mixed workloads and large datasets, the creation of the transactional snapshot can become very expensive, as long-running analytical transactions may request old versions, residing on cold storage, for reasons of transactional consistency. Furthermore, analytical queries operate on cold data, stored on slow persistent storage. Due to the poor data locality, snapshot creation may cause massive data transfers and thus lower performance. Given the current trend towards computational storage and near-data processing, it has become viable to perform such operations in-storage to reduce data transfers and improve scalability. neoDBMS is a DBMS designed for near-data processing and computational storage. In this paper, we demonstrate how neoDBMS performs snapshot computation in-situ. We showcase different interactive scenarios, where neoDBMS outperforms PostgreSQL 12 by up to 5×.

An evaluation of using CCIX for cache-coherent host-FPGA interfacing (2022)

Tamimi, Sajjad ; Stock, Florian ; Koch, Andreas ; Bernhardt, Arthur ; Petrov, Ilia

For a long time, most discrete accelerators have been attached to host systems using various generations of the PCI Express interface. However, with its lack of support for coherency between accelerator and host caches, fine-grained interactions require frequent cache-flushes, or even the use of inefficient uncached memory regions. The Cache Coherent Interconnect for Accelerators (CCIX) was the first multi-vendor standard for enabling cache-coherent host-accelerator attachments, and already is indicative of the capabilities of upcoming standards such as Compute Express Link (CXL). In our work, we compare and contrast the use of CCIX with PCIe when interfacing an ARM-based host with two generations of CCIX-enabled FPGAs. We provide both low-level throughput and latency measurements for accesses and address translation, as well as examine an application-level use-case of using CCIX for fine-grained synchronization in an FPGA-accelerated database system. We can show that especially smaller reads from the FPGA to the host can benefit from CCIX by having roughly 33% shorter latency than PCIe. Small writes to the host have a latency roughly 32% higher than PCIe, though, since they carry a higher coherency overhead. For the database use-case, the use of CCIX allowed to maintain a constant synchronization latency even with heavy host-FPGA parallelism.

Storage management with multi-version partitioned BTrees (2022)

Riegger, Christian ; Petrov, Ilia

Database management systems and K/V-Stores operate on updatable datasets – massively exceeding the size of available main memory. Tree-based K/V storage management structures became particularly popular in storage engines. B+ -Trees [1, 4] allow constant search performance, however write-heavy workloads yield in inefficient write patterns to secondary storage devices and poor performance characteristics. LSM-Trees [16, 23] overcome this issue by horizontal partitioning fractions of data – small enough to fully reside in main memory, but require frequent maintenance to sustain search performance. Firstly, we propose Multi-Version Partitioned BTrees (MV-PBT) as sole storage and index management structure in key-sorted storage engines like K/V-Stores. Secondly, we compare MV-PBT against LSM-Trees. The logical horizontal partitioning in MV-PBT allows leveraging recent advances in modern B+ -Tree techniques in a small transparent and memory resident portion of the structure. Structural properties sustain steady read performance, yielding efficient write patterns and reducing write amplification. We integrated MV-PBT in the WiredTiger [15] KV storage engine. MV-PBT offers an up to 2× increased steady throughput in comparison to LSM-Trees and several orders of magnitude in comparison to B+ -Trees in a YCSB [5] workload.

Near-data processing in database systems on native computational storage under HTAP workloads (2022)

Vinçon, Tobias ; Knödler, Christian ; Solis-Vasquez, Leonardo ; Bernhardt, Arthur ; Tamimi, Sajjad ; Weber, Lukas ; Stock, Florian ; Koch, Andreas ; Petrov, Ilia

Today’s Hybrid Transactional and Analytical Processing (HTAP) systems, tackle the ever-growing data in combination with a mixture of transactional and analytical workloads. While optimizing for aspects such as data freshness and performance isolation, they build on the traditional data-to-code principle and may trigger massive cold data transfers that impair the overall performance and scalability. Firstly, in this paper we show that Near-Data Processing (NDP) naturally fits in the HTAP design space. Secondly, we propose an NDP database architecture, allowing transactionally consistent in-situ executions of analytical operations in HTAP settings. We evaluate the proposed architecture in state-of-the-art key/value-stores and multi-versioned DBMS. In contrast to traditional setups, our approach yields robust, resource- and cost-effcient performance.

Cache-coherent shared locking for transactionally consistent updates in near-data processing DBMS on smart storage (2022)

Bernhardt, Arthur ; Tamimi, Sajjad ; Stock, Florian ; Vinçon, Tobias ; Koch, Andreas ; Petrov, Ilia

Even though near-data processing (NDP) can provably reduce data transfers and increase performance, current NDP is solely utilized in read-only settings. Slow or tedious to implement synchronization and invalidation mechanisms between host and smart storage make NDP support for data-intensive update operations difficult. In this paper, we introduce a low-latency cache-coherent shared lock table for update NDP settings in disaggregated memory environments. It utilizes the novel CCIX interconnect technology and is integrated in neoDBMS, a near-data processing DBMS for smart storage. Our evaluation indicates end-to-end lock latencies of ∼80-100ns and robust performance under contention.

On the necessity of explicit cross-layer data formats in near-data processing systems (2021)

Weber, Lukas ; Vinçon, Tobias ; Knödler, Christian ; Solis-Vasquez, Leonardo ; Bernhardt, Arthur ; Petrov, Ilia ; Koch, Andreas

Massive data transfers in modern data-intensive systems resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-Data processing (NDP) and a shift to code-to-data designs may represent a viable solution as packaging combinations of storage and compute elements on the same device has become feasible. The shift towards NDP system architectures calls for revision of established principles. Abstractions such as data formats and layouts typically spread multiple layers in traditional DBMS, the way they are processed is encapsulated within these layers of abstraction. The NDP-style processing requires an explicit definition of cross-layer data formats and accessors to ensure in-situ executions optimally utilizing the properties of the underlying NDP storage and compute elements. In this paper, we make the case for such data format definitions and investigate the performance benefits under RocksDB and the COSMOS hardware platform.

Result-set management for NDP operations on smart storage (2022)

Vinçon, Tobias ; Knödler, Christian ; Bernhardt, Arthur ; Solis-Vasquez, Leonardo ; Weber, Lukas ; Koch, Andreas ; Petrov, Ilia

Current data-intensive systems suffer from scalability as they transfer massive amounts of data to the host DBMS to process it there. Novel near-data processing (NDP) DBMS architectures and smart storage can provably reduce the impact of raw data movement. However, transferring the result-set of an NDP operation may increase the data movement, and thus, the performance overhead. In this paper, we introduce a set of in-situ NDP result-set management techniques, such as spilling, materialization, and reuse. Our evaluation indicates a performance improvement of 1.13 × to 400 ×.

Open Access

Informatik

Refine

Author

Year of publication

Document Type

Language

Has full text

Is part of the Bibliography

Institute

Publisher

47 search hits