Refine
Document Type
- Conference proceeding (18)
- Journal article (3)
- Book chapter (3)
- Doctoral Thesis (1)
Language
- English (25)
Has full text
- yes (25)
Is part of the Bibliography
- yes (25)
Institute
- Informatik (25)
Publisher
Massive data transfers in modern key/value stores resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-data processing (NDP) designs represent a feasible solution, which although not new, have yet to see widespread use.
In this paper we introduce nKV, which is a key/value store utilizing native computational storage and near-data processing. On the one hand, nKV can directly control the data and computation placement on the underlying storage hardware. On the other hand, nKV propagates the data formats and layouts to the storage device where, software and hardware parsers and accessors are implemented. Both allow NDP operations to execute in host-intervention-free manner, directly on physical addresses and thus better utilize the underlying hardware. Our performance evaluation is based on executing traditional KV operations (GET, SCAN) and on complex graph-processing algorithms (Betweenness Centrality) in-situ, with 1.4×-2.7× better performance on real hardware – the COSMOS+ platform.
Massive data transfers in modern data intensive systems resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-data processing (NDP) and a shift to code-to-data designs may represent a viable solution as packaging combinations of storage and compute elements on the same device has become viable.
The shift towards NDP system architectures calls for revision of established principles. Abstractions such as data formats and layouts typically spread multiple layers in traditional DBMS, the way they are processed is encapsulated within these layers of abstraction. The NDP-style processing requires an explicit definition of cross-layer data formats and accessors to ensure in-situ executions optimally utilizing the properties of the underlying NDP storage and compute elements. In this paper, we make the case for such data format definitions and investigate the performance benefits under NoFTL-KV and the COSMOS hardware platform.
nKV in action: accelerating KVstores on native computational storage with NearData processing
(2020)
Massive data transfers in modern data intensive systems resulting from low data-locality and data-to-code system design hurt their performance and scalability. Near-data processing (NDP) designs represent a feasible solution, which although not new, has yet to see widespread use.
In this paper we demonstrate various NDP alternatives in nKV, which is a key/value store utilizing native computational storage and near-data processing. We showcase the execution of classical operations (GET, SCAN) and complex graph-processing algorithms (Betweenness Centrality) in-situ, with 1.4x-2.7x better performance due to NDP. nKV runs on real hardware - the COSMOS+ platform.
We introduce IPA-IDX – an approach to handle index modifications modern storage technologies (NVM, Flash) as physical in-place appends, using simplified physiological log records. IPA-IDX provides similar performance and longevity advantages for indexes as basic IPA [5] does for tables. The selective application of IPA-IDX and basic IPA to certain regions and objects, lowers the GC overhead by over 60%, while keeping the total space overhead to 2%. The combined effect of IPA and IPA-IDX increases performance by 28%.
Many modern DBMS architectures require transferring data from storage to process it afterwards. Given the continuously increasing amounts of data, data transfers quickly become a scalability limiting factor. Near-Data Processing and smart/computational storage emerge as promising trends allowing for decoupled in-situ operation execution, data transfer reduction and better bandwidth utilization. However, not every operation is suitable for an in-situ execution and a careful placement and optimization is needed.
In this paper we present an NDP-aware cost model. It has been implemented in MySQL and evaluated with nKV. We make several observations underscoring the need for optimization.
Near-Data Processing is a promising approach to overcome the limitations of slow I/O interfaces in the quest to analyze the ever-growing amount of data stored in database systems. Next to CPUs, FPGAs will play an important role for the realization of functional units operating close to data stored in non-volatile memories such as Flash.It is essential that the NDP-device understands formats and layouts of the persistent data, to perform operations in-situ. To this end, carefully optimized format parsers and layout accessors are needed. However, designing such FPGA-based Near-Data Processing accelerators requires significant effort and expertise. To make FPGA-based Near-Data Processing accessible to non-FPGA experts, we will present a framework for the automatic generation of FPGA-based accelerators capable of data filtering and transformation for key-value stores based on simple data-format specifications.The evaluation shows that our framework is able to generate accelerators that are almost identical in performance compared to the manually optimized designs of prior work, while requiring little to no FPGA-specific knowledge and additionally providing improved flexibility and more powerful functionality.
Real Time Charging (RTC) applications that reside in the telecommunications domain have the need for extremely fast database transactions. Today´s providers rely mostly on in-memory databases for this kind of information processing. A flexible and modular benchmark suite specifically designed for this domain provides a valuable framework to test the performance of different DB candidates. Besides a data and a load generator, the suite also includes decoupled database connectors and use case components for convenient customization and extension. Such easily produced test results can be used as guidance for choosing a subset of candidates for further tuning/testing and finally evaluating the database most suited to the chosen use cases. This is why our benchmark suite can be of value for choosing databases for RTC use cases.
The amount of image data has been rising exponentially over the last decades due to numerous trends like social networks, smartphones, automotive, biology, medicine and robotics. Traditionally, file systems are used as storage. Although they are easy to use and can handle large data volumes, they are suboptimal for efficient sequential image processing due to the limitation of data organisation on single images. Database systems and especially column-stores support more stuctured storage and access methods on the raw data level for entiere series.
In this paper we propose definitions of various layouts for an efficient storage of raw image data and metadata in a column store. These schemes are designed to improve the runtime behaviour of image processing operations. We present a tool called column-store Image Processing Toolbox (cIPT) allowing to easily combine the data layouts and operations for different image processing scenarios.
The experimental evaluation of a classification task on a real world image dataset indicates a performance increase of up to 15x on a column store compared to a traditional row-store (PostgreSQL) while the space consumption is reduced 7x. With these results cIPT provides the basis for a future mature database feature.
Rapidly growing data volumes push today's analytical systems close to the feasible processing limit. Massive parallelism is one possible solution to reduce the computational time of analytical algorithms. However, data transfer becomes a significant bottleneck since it blocks system resources moving data-to-code. Technological advances allow to economically place compute units close to storage and perform data processing operations close to data, minimizing data transfers and increasing scalability. Hence the principle of Near Data Processing (NDP) and the shift towards code-to-data. In the present paper we claim that the development of NDP-system architectures becomes an inevitable task in the future. Analytical DBMS like HPE Vertica have multiple points of impact with major advantages which are presented within this paper.
Over the last decades, a tremendous change toward using information technology in almost every daily routine of our lives can be perceived in our society, entailing an incredible growth of data collected day-by-day on Web, IoT, and AI applications.
At the same time, magneto-mechanical HDDs are being replaced by semiconductor storage such as SSDs, equipped with modern Non-Volatile Memories, like Flash, which yield significantly faster access latencies and higher levels of parallelism. Likewise, the execution speed of processing units increased considerably as nowadays server architectures comprise up to multiple hundreds of independently working CPU cores along with a variety of specialized computing co-processors such as GPUs or FPGAs.
However, the burden of moving the continuously growing data to the best fitting processing unit is inherently linked to today’s computer architecture that is based on the data-to-code paradigm. In the light of Amdahl's Law, this leads to the conclusion that even with today's powerful processing units, the speedup of systems is limited since the fraction of parallel work is largely I/O-bound.
Therefore, throughout this cumulative dissertation, we investigate the paradigm shift toward code-to-data, formally known as Near-Data Processing (NDP), which relieves the contention on the I/O bus by offloading processing to intelligent computational storage devices, where the data is originally located.
Firstly, we identified Native Storage Management as the essential foundation for NDP due to its direct control of physical storage management within the database. Upon this, the interface is extended to propagate address mapping information and to invoke NDP functionality on the storage device. As the former can become very large, we introduce Physical Page Pointers as one novel NDP abstraction for self-contained immutable database objects.
Secondly, the on-device navigation and interpretation of data are elaborated. Therefore, we introduce cross-layer Parsers and Accessors as another NDP abstraction that can be executed on the heterogeneous processing capabilities of modern computational storage devices. Thereby, the compute placement and resource configuration per NDP request is identified as a major performance criteria. Our experimental evaluation shows an improvement in the execution durations of 1.4x to 2.7x compared to traditional systems. Moreover, we propose a framework for the automatic generation of Parsers and Accessors on FPGAs to ease their application in NDP.
Thirdly, we investigate the interplay of NDP and modern workload characteristics like HTAP. Therefore, we present different offloading models and focus on an intervention-free execution. By propagating the Shared State with the latest modifications of the database to the computational storage device, it is able to process data with transactional guarantees. Thus, we achieve to extend the design space of HTAP with NDP by providing a solution that optimizes for performance isolation, data freshness, and the reduction of data transfers. In contrast to traditional systems, we experience no significant drop in performance when an OLAP query is invoked but a steady and 30% faster throughput.
Lastly, in-situ result-set management and consumption as well as NDP pipelines are proposed to achieve flexibility in processing data on heterogeneous hardware. As those produce final and intermediary results, we continue investigating their management and identified that an on-device materialization comes at a low cost but enables novel consumption modes and reuse semantics. Thereby, we achieve significant performance improvements of up to 400x by reusing once materialized results multiple times.