OPUS 4 | Search

A cost model for NDP-aware query optimization for KV-stores (2021)

Knödler, Christian ; Vinçon, Tobias ; Bernhardt, Arthur ; Petrov, Ilia ; Solis-Vasquez, Leonardo ; Weber, Lukas ; Koch, Andreas

Many modern DBMS architectures require transferring data from storage to process it afterwards. Given the continuously increasing amounts of data, data transfers quickly become a scalability limiting factor. Near-Data Processing and smart/computational storage emerge as promising trends allowing for decoupled in-situ operation execution, data transfer reduction and better bandwidth utilization. However, not every operation is suitable for an in-situ execution and a careful placement and optimization is needed. In this paper we present an NDP-aware cost model. It has been implemented in MySQL and evaluated with nKV. We make several observations underscoring the need for optimization.

Framework for the automatic generation of FPGA-based Near-Data Processing accelerators in smart storage systems (2021)

Weber, Lukas ; Sommer, Lukas ; Solis-Vasquez, Leonardo ; Vinçon, Tobias ; Knödler, Christian ; Bernhardt, Arthur ; Petrov, Ilia ; Koch, Andreas

Near-Data Processing is a promising approach to overcome the limitations of slow I/O interfaces in the quest to analyze the ever-growing amount of data stored in database systems. Next to CPUs, FPGAs will play an important role for the realization of functional units operating close to data stored in non-volatile memories such as Flash.It is essential that the NDP-device understands formats and layouts of the persistent data, to perform operations in-situ. To this end, carefully optimized format parsers and layout accessors are needed. However, designing such FPGA-based Near-Data Processing accelerators requires significant effort and expertise. To make FPGA-based Near-Data Processing accessible to non-FPGA experts, we will present a framework for the automatic generation of FPGA-based accelerators capable of data filtering and transformation for key-value stores based on simple data-format specifications.The evaluation shows that our framework is able to generate accelerators that are almost identical in performance compared to the manually optimized designs of prior work, while requiring little to no FPGA-specific knowledge and additionally providing improved flexibility and more powerful functionality.

AnyDB: an architecture-less DBMS for any workload (2021)

Bang, Tiemo ; May, Norman ; Petrov, Ilia ; Binnig, Carsten

In this paper, we propose a radical new approach for scale-out distributed DBMSs. Instead of hard-baking an architectural model, such as a shared-nothing architecture, into the distributed DBMS design, we aim for a new class of so-called architecture-less DBMSs. The main idea is that an architecture-less DBMS can mimic any architecture on a per-query basis on-the-fly without any additional overhead for reconfiguration. Our initial results show that our architecture-less DBMS AnyDB can provide significant speedup across varying workloads compared to a traditional DBMS implementing a static architecture.

SIASChains: Snapshot Isolation Append Storage Chains (2017)

Gottstein, Robert ; Petrov, Ilia ; Buchmann, Alejandro ; Hardock, Sergej

Asymmetric read/write storage technologies such as Flash are becoming a dominant trend in modern database systems. They introduce hardware characteristics and properties which are fundamentally different from those of traditional storage technologies such as HDDs. Multi-Versioning Database Management Systems (MV-DBMSs) and Log-based Storage Managers (LbSMs) are concepts that can effectively address the properties of these storage technologies but are designed for the characteristics of legacy hardware. A critical component of MV-DBMSs is the invalidation model: commonly, transactional timestamps are assigned to the old and the new version, resulting in two independent (physical) update operations. Those entail multiple random writes as well as in-place updates, sub-optimal for new storage technologies both in terms of performance and endurance. Traditional page-append LbSM approaches alleviate random writes and immediate in-place updates, hence reducing the negative impact of Flash read/write asymmetry. Nevertheless, they entail significant mapping overhead, leading to write amplification. In this work we present an approach called Snapshot Isolation Append Storage Chains (SIAS-Chains) that employs a combination of multi-versioning, append storage management in tuple granularity and novel singly-linked (chain-like) version organization. SIAS-Chains features: simplified buffer management, multi-version indexing and introduces read/write optimizations to data placement on modern storage media. SIAS-Chains algorithmically avoids small in-place updates, caused by in-place invalidation and converts them into appends. Every modification operation is executed as an append and recently inserted tuple versions are co-located.

Multi-version indexing and modern hardware technologies (2017)

Riegger, Christian ; Vinçon, Tobias ; Petrov, Ilia

Characteristics of modern computing and storage technologies fundamentally differ from traditional hardware. There is a need to optimally leverage their performance, endurance and energy consumption characteristics. Therefore, existing architectures and algorithms in modern high performance database management systems have to be redesigned and advanced. Multi Version Concurrency Control (MVCC) approaches in data-base management systems maintain multiple physically independent tuple versions. Snapshot isolation approaches enable high parallelism and concurrency in workloads with almost serializable consistency level. Modern hardware technologies benefit from multi-version approaches. Indexing multi-version data on modern hardware is still an open research area. In this paper, we provide a survey of popular multi-version indexing approaches and an extended scope of high performance single-version approaches. An optimal multi-version index structure brings look-up efficiency of tuple versions, which are visible to transactions, and effort on index maintenance in balance for different workloads on modern hardware technologies.

Write-optimized indexing with partitioned B-trees (2017)

Riegger, Christian ; Vinçon, Tobias ; Petrov, Ilia

Database management systems (DBMS) are critical performance components in large scale applications under modern update intensive workloads. Additional access paths accelerate look-up performance in DBMS for frequently queried attributes, but the required maintenance slows down update performance. The ubiquitous B+ tree is a commonly used key-indexed access path that is able to support many required functionalities with logarithmic access time to requested records. Modern processing and storage technologies and their characteristics require reconsideration of matured indexing approaches for today's workloads. Partitioned B-trees (PBT) leverage characteristics of modern hardware technologies and complex memory hierarchies as well as high update rates and changes in workloads by maintaining partitions within one single B+-Tree. This paper includes an experimental evaluation of PBTs optimized write pattern and performance improvements. With PBT transactional throughput under TPC-C increases 30%; PBT results in beneficial sequential write patterns even in presence of updates and maintenance operations.

Indexing large updatable datasets in multi-version database management systems (2019)

Riegger, Christian ; Vinçon, Tobias ; Petrov, Ilia

Database Management Systems (DBMS) need to handle large updatable datasets in on-line transaction processing (OLTP) workloads. Most modern DBMS provide snapshots of data in multi-version concurrency control (MVCC) transaction management scheme. Each transaction operates on a snapshot of the database, which is calculated from a set of tuple versions. High parallelism and resource-efficient append-only data placement on secondary storage is enabled. One major issue in indexing tuple versions on modern hardware technologies is the high write amplification for tree-indexes. Partitioned B-Trees (PBT) [5] is based on the structure of the ubiquitous B+ Tree [8]. They achieve a near optimal write amplification and beneficial sequential writes on secondary storage. Yet they have not been implemented in a MVCC enabled DBMS to date. In this paper we present the implementation of PBTs in PostgreSQL extended with SIAS. Compared to PostgreSQL’s B+–Trees PBTs have 50% better transaction throughput under TPC-C and a 30% improvement to standard PostgreSQL with Heap-Only Tuples.

Native storage techniques for data management (2019)

Petrov, Ilia ; Koch, Andreas ; Hardock, Sergej ; Vinçon, Tobias ; Riegger, Christian

In the present tutorial we perform a cross-cut analysis of database storage management from the perspective of modern storage technologies. We argue that neither the design of modern DBMS, nor the architecture of modern storage technologies are aligned with each other. Moreover, the majority of the systems rely on a complex multi-layer and compatibility oriented storage stack. The result is needlessly suboptimal DBMS performance, inefficient utilization, or significant write amplification due to outdated abstractions and interfaces. In the present tutorial we focus on the concept of native storage, which is storage operated without intermediate abstraction layers over an open native storage interface and is directly controlled by the DBMS.

nativeNDP: processing big data analytics on native storage nodes (2019)

Vinçon, Tobias ; Hardock, Sergej ; Riegger, Christian ; Koch, Andreas ; Petrov, Ilia

Data analytics tasks on large datasets are computationally intensive and often demand the compute power of cluster environments. Yet, data cleansing, preparation, dataset characterization and statistics or metrics computation steps are frequent. These are mostly performed ad hoc, in an explorative manner and mandate low response times. But, such steps are I/O intensive and typically very slow due to low data locality, inadequate interfaces and abstractions along the stack. These typically result in prohibitively expensive scans of the full dataset and transformations on interface boundaries. In this paper, we examine R as analytical tool, managing large persistent datasets in Ceph, a wide-spread cluster file-system. We propose nativeNDP – a framework for Near Data Processing that pushes down primitive R tasks and executes them in-situ, directly within the storage device of a cluster-node. Across a range of data sizes, we show that nativeNDP is more than an order of magnitude faster than other pushdown alternatives.

IPA-IDX: In-Place Appends for B-tree indices (2019)

Hardock, Sergej ; Koch, Andreas ; Vinçon, Tobias ; Petrov, Ilia

We introduce IPA-IDX – an approach to handle index modifications modern storage technologies (NVM, Flash) as physical in-place appends, using simplified physiological log records. IPA-IDX provides similar performance and longevity advantages for indexes as basic IPA [5] does for tables. The selective application of IPA-IDX and basic IPA to certain regions and objects, lowers the GC overhead by over 60%, while keeping the total space overhead to 2%. The combined effect of IPA and IPA-IDX increases performance by 28%.

Author(s)
Title
Additional person(s)
Publisher
Supervisor(s)
Abstract
Full text

Open Access

Refine

Author

Year of publication

Document Type

Language

Has full text

Is part of the Bibliography

Institute

Publisher

38 search hits