Refine
Document Type
- Conference proceeding (38)
- Journal article (5)
- Book chapter (5)
Has full text
- yes (48)
Is part of the Bibliography
- yes (48)
Institute
- Informatik (48)
Publisher
- Association for Computing Machinery (15)
- Springer (9)
- IEEE (7)
- OpenProceedings (4)
- Gesellschaft für Informatik e.V (3)
- Universität Konstanz (3)
- IARIA (2)
- Association of Computing Machinery (1)
- CIDR (1)
- SciTePress (1)
Under update intensive workloads (TPC, LinkBench) small updates dominate the write behavior, e.g. 70% of all updates change less than 10 bytes across all TPC OLTP workloads. These are typically performed as in-place updates and result in random writes in page-granularity, causing major write-overhead on Flash storage, a write amplification of several hundred times and lower device longevity.
In this paper we propose an approach that transforms those small in-place updates into small update deltas that are appended to the original page. We utilize the commonly ignored fact that modern Flash memories (SLC, MLC, 3D NAND) can handle appends to already programmed physical pages by using various low-level techniques such as ISPP to avoid expensive erases and page migrations. Furthermore, we extend the traditional NSM page-layout with a delta-record area that can absorb those small updates. We propose a scheme to control the write behavior as well as the space allocation and sizing of database pages.
The proposed approach has been implemented under Shore- MT and evaluated on real Flash hardware (OpenSSD) and a Flash emulator. Compared to In-Page Logging it performs up to 62% less reads and writes and up to 74% less erases on a range of workloads. The experimental evaluation indicates: (i) significant reduction of erase operations resulting in twice the longevity of Flash devices under update-intensive workloads; (ii) 15%-60% lower read/write I/O latencies; (iii) up to 45% higher transactional throughput; (iv) 2x to 3x reduction in overall write
amplification.
In this paper we build on our research in data management on native Flash storage. In particular we demonstrate the advantages of intelligent data placement strategies. To effectively manage phsical Flash space and organize the data on it, we utilize novel storage structures such as regions and groups. These are coupled to common DBMS logical structures, thus require no extra overhead for the DBA. The experimental results indicate an improvement of up to 2x, which doubles the longevity of Flash SSD. During the demonstration the audience can experience the advantages of the proposed approach on real Flash hardware.
Active storage
(2018)
In brief, Active Storage refers to an architectural hardware and software paradigm, based on collocation storage and compute units. Ideally, it will allow to execute application-defined data ... within the physical data storage. Thus Active Storage seeks to minimize expensive data movement, improving performance, scalability, and resource efficiency. The effective use of Active Storage mandates new architectures, algorithms, interfaces, and development toolchains.
A transaction is a demarcated sequence of application operations, for which the following properties are guaranteed by the underlying transaction processing system (TPS): atomicity, consistency, isolation, and durability (ACID). Transactions are therefore a general abstraction, provided by TPS that simplifies application development by relieving transactional applications from the burden of concurrency and failure handling. Apart from the ACID properties, a TPS must guarantee high and robust performance (high transactional throughput and low response times), high reliability (no data loss, ability to recover last consistent state, fault tolerance), and high availability (infrequent outages, short recovery times).
The architectures and workhorse algorithms of a high-performance TPS are built around the properties of the underlying hardware. The introduction of nonvolatile memories (NVM) as novel storage technology opens an entire new problem space, with the need to revise aspects such as the virtual memory hierarchy, storage management and data placement, access paths, and indexing. NVM are also referred to as storage-class memory (SCM).
Modern persistent Key/Value stores are designed to meet the demand for high transactional throughput and high data ingestion rates. Still, they rely on backwards-compatible storage stack and abstractions to ease space management, foster seamless proliferation and system integration. Their dependence on the traditional I/O stack has negative impact on performance, causes unacceptably high write-amplification, and limits the storage longevity.
In the present paper we present NoFTL KV, an approach that results in a lean I/O stack, integrating physical storage management natively in the Key/Value store. NoFTL-KV eliminates backwards compatibility, allowing the Key/Value store to directly consume the characteristics of modern storage technologies. NoFTLKV is implemented under RocksDB. The performance evaluation under LinkBench shows that NoFTL-KV improves transactional throughput by 33%, while response times improve up to 2.3x. Furthermore, NoFTL KV reduces write-amplification 19x and improves storage longevity by imately the same factor.
Blockchains yield to new workloads in database management systems and K/V-stores. Distributed Ledger Technology (DLT) is a technique for managing transactions in ’trustless’ distributed systems. Yet, clients of nodes in blockchain networks are backed by ’trustworthy’ K/V-Stores, like LevelDB or RocksDB in Ethereum, which are based on Log-Structured Merge Trees (LSM Trees). However, LSM-Trees do not fully match the properties of blockchains and enterprise workloads.
In this paper, we claim that Partitioned B-Trees (PBT) fit the properties of this DLT: uniformly distributed hash keys, immutability, consensus, invalid blocks, unspent and off-chain transactions, reorganization and data state / version ordering in a distributed log-structure. PBT can locate records of newly inserted key-value pairs, as well as data of unspent transactions, in separate partitions in main memory. Once several blocks acquire consensus, PBTs evict a whole partition, which becomes immutable, to secondary storage. This behavior minimizes write amplification and enables a beneficial sequential write pattern on modern hardware. Furthermore, DLT implicate some type of log-based versioning. PBTs can serve as MV-store for data storage of logical blocks and indexing in multi-version concurrency control (MVCC) transaction processing.
nKV in action: accelerating KVstores on native computational storage with NearData processing
(2020)
Massive data transfers in modern data intensive systems resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-data processing (NDP) designs represent a feasible solution, which although not new, has yet to see widespread use.
In this paper we demonstrate various NDP alternatives in nKV, which is a key/value store utilizing native computational storage and near-data processing. We showcase the execution of classical operations (GET, SCAN) and complex graph-processing algorithms (Betweenness Centrality) in-situ, with 1.4x-2.7x better performance due to NDP. nKV runs on real hardware - the COSMOS+ platform.
Massive data transfers in modern key/value stores resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-data processing (NDP) designs represent a feasible solution, which although not new, have yet to see widespread use.
In this paper we introduce nKV, which is a key/value store utilizing native computational storage and near-data processing. On the one hand, nKV can directly control the data and computation placement on the underlying storage hardware. On the other hand, nKV propagates the data formats and layouts to the storage device where, software and hardware parsers and accessors are implemented. Both allow NDP operations to execute in host-intervention-free manner, directly on physical addresses and thus better utilize the underlying hardware. Our performance evaluation is based on executing traditional KV operations (GET, SCAN) and on complex graph-processing algorithms (Betweenness Centrality) in-situ, with 1.4×-2.7× better performance on real hardware – the COSMOS+ platform.
Massive data transfers in modern data intensive systems resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-data processing (NDP) and a shift to code-to-data designs may represent a viable solution as packaging combinations of storage and compute elements on the same device has become viable.
The shift towards NDP system architectures calls for revision of established principles. Abstractions such as data formats and layouts typically spread multiple layers in traditional DBMS, the way they are processed is encapsulated within these layers of abstraction. The NDP-style processing requires an explicit definition of cross-layer data formats and accessors to ensure in-situ executions optimally utilizing the properties of the underlying NDP storage and compute elements. In this paper, we make the case for such data format definitions and investigate the performance benefits under NoFTL-KV and the COSMOS hardware platform.
The tale of 1000 cores: an evaluation of concurrency control on real(ly) large multi-socket hardware
(2020)
In this paper, we set out the goal to revisit the results of “Starring into the Abyss [...] of Concurrency Control with [1000] Cores” and analyse in-memory DBMSs on today’s large hardware. Despite the original assumption of the authors, today we do not see single-socket CPUs with 1000 cores. Instead multi-socket hardware made its way into production data centres. Hence, we follow up on this prior work with an evaluation of the characteristics of concurrency control schemes on real production multi-socket hardware with 1568 cores. To our surprise, we made several interesting findings which we report on in this paper.