OPUS 4 | Search

MV-IDX : indexing in multi-version databases (2014)

Gottstein, Robert ; Goyal, Rohit ; Hardock, Sergej ; Petrov, Ilia ; Buchmann, Alejandro

An index in a Multi-Version DBMS (MV-DBMS) has to reflect different tuple versions of a single data item. Existing approaches follow the paradigm of logically separating the tuple version data from the data item, e.g. an index is only allowed to return at most one version of a single data item (while it may return multiple data items that match a search criteria). Hence to determine the valid (and therefore visible) tuple version of a data item, the MV-DBMS first fetches all tuple versions that match the search criteria and subsequently filters visible versions using visibility checks. This involves I/O storage accesses to tuple versions that do not have to be fetched. In this vision paper we present the Multi Version Index (MV-IDX) approach that allows index-only visibility checks which significantly reduce the amount of I/O storage accesses as well as the index maintenance overhead. The MV-IDX achieves significantly lower response times and higher transactional throughput on OLTP workloads.

Configurable, energy-efficient, application- and channel-aware middleware approaches for cyber-physical systems (2014)

Nawaz, Khalid ; Petrov, Ilia ; Buchmann, Alejandro

The use of Wireless Sensor and Actuator Networks (WSAN) as an enabling technology for Cyber-Physical Systems has increased significantly in recent past. The challenges that arise in different application areas of Cyber- Physical Systems, in general, and in WSAN in particular, are getting the attention of academia and industry both. Since reliability issues for message delivery in wireless communication are of critical importance for certain safety related applications, it is one of the areas that has received significant focus in the research community. Additionally, the diverse needs of different applications put different demands on the lower layers in the protocol stack, thus necessitating such mechanisms in place in the lower layers which enable them to dynamically adapt. Another major issue in the realization of networked wirelessly communicating cyber-physical systems, in general, and WSAN, in particular, is the lack of approaches that tackle the reliability, configurability and application awareness issues together. One could consider tackling these issues in isolation. However, the interplay between these issues create such challenges that make the application developers spend more time on meeting these challenges, and that too not in very optimal ways, than spending their time on solving the problems related to the application being developed. Starting from some fundamental concepts, general issues and problems in cyber-physical systems, this chapter discusses such issues like energy-efficiency, application and channel-awareness for networked wirelessly communicating cyber-physical systems. Additionally, the chapter describes a middleware approach called CEACH, which is an acronym for Configurable, Energy-efficient, Application- and Channel-aware Clustering based middleware service for cyber-physical systems. The state of-the art in the area of cyberphysical systems with a special focus on communication reliability, configurability, application- and channel-awareness is described in the chapter. The chapter also describes how these features have been considered in the CEACH approach. Important node level and network level characteristics and their significance vis-àvis the design of applications for cyber physical systems is also discussed. The issue of adaptively controlling the impact of these factors vis-à-vis the application demands and network conditions is also discussed. The chapter also includes a description of Fuzzy-CEACH which is an extension of CEACH middleware service and which uses fuzzy logic principles. The fuzzy descriptors used in different stages of Fuzzy-CEACH have also been described. The fuzzy inference engine used in the Fuzzy-CEACH cluster head election process is described in detail. The Rule-Bases used by fuzzy inference engine in different stages of Fuzzy-CEACH is also included to show an insightful description of the protocol. The chapter also discusses in detail the experimental results validating the authenticity of the presented concepts in the CEACH approach. The applicability of the CEACH middleware service in different application scenarios in the domain of cyberphysical systems is also discussed. The chapter concludes by shedding light on the Publish-Subscribe mechanisms in distributed event-based systems and showing how they can make use of the CEACH middleware to reliably communicate detected events to the event-consumers or the actuators if the WSAN is modeled as a distributed event-based system.

Aspects of append-based database storage management on flash memories (2013)

Gottstein, Robert ; Petrov, Ilia ; Buchmann, Alejandro

New storage technologies, such as Flash and Non- Volatile Memories, with fundamentally different properties are appearing. Leveraging their performance and endurance requires a redesign of existing architecture and algorithms in modern high performance databases. Multi-Version Concurrency Control (MVCC) approaches in database systems, maintain multiple timestamped versions of a tuple. Once a transaction reads a tuple the database system tracks and returns the respective version eliminating lock-requests. Hence under MVCC reads are never blocked, which leverages well the excellent read performance (high throughput, low latency) of new storage technologies. Upon tuple updates, however, established implementations of MVCC approaches (such as Snapshot Isolation) lead to multiple random writes – caused by (i) creation of the new and (ii) in-place invalidation of the old version – thus generating suboptimal access patterns for the new storage media. The combination of an append based storage manager operating with tuple granularity and snapshot isolation addresses asymmetry and in-place updates. In this paper, we highlight novel aspects of log-based storage, in multi-version database systems on new storage media. We claim that multi-versioning and append-based storage can be used to effectively address asymmetry and endurance. We identify multi-versioning as the approach to address dataplacement in complex memory hierarchies. We focus on: version handling, (physical) version placement, compression and collocation of tuple versions on Flash storage and in complex memory hierarchies. We identify possible read- and cacherelated optimizations.

Maintaining SOA systems of the future : how can ontological modeling help? (2014)

Gonen, Bilal ; Fang, Xingang ; El-Sheikh, Eman ; Bagui, Sikha ; Wilde, Norman ; Zimmermann, Alfred ; Petrov, Ilia

Many future Services Oriented Architecture (SOA) systems may be pervasive SmartLife applications that provide real-time support for users in everyday tasks and situations. Development of such applications will be challenging, but in this position paper we argue that their ongoing maintenance may be even more so. Ontological modelling of the application may help to ease this burden, but maintainers need to understand a system at many levels, from a broad architectural perspective down to the internals of deployed components. Thus we will need consistent models that span the range of views, from business processes through system architecture to maintainable code. We provide an initial example of such a modelling approach and illustrate its application in a semantic browser to aid in software maintenance tasks.

Sales prediction with parametrized time series analysis (2013)

Schaidnagel, Michael ; Abele, Christian ; Laux, Fritz ; Petrov, Ilia

When forecasting sales figures, not only the sales history but also the future price of a product will influence the sales quantity. At first sight, multivariate time series seem to be the appropriate model for this task. Nontheless, in real life history is not always repeatable, i.e. in the case of sales history there is only one price for a product at a given time. This complicates the design of a multivariate time series. However, for some seasonal or perishable products the price is rather a function of the expiration date than of the sales history. This additional information can help to design a more accurate and causal time series model. The proposed solution uses an univariate time series model but takes the price of a product as a parameter that influences systematically the prediction. The price influence is computed based on historical sales data using correlation analysis and adjustable price ranges to identify products with comparable history. Compared to other techniques this novel approach is easy to compute and allows to preset the price parameter for predictions and simulations. Tests with data from the Data Mining Cup 2012 demonstrate better results than established sophisticated time series methods.

Real time charging database benchmarking (2015)

Bogner, Justus ; Dehner, Carolin ; Vinçon, Tobias ; Petrov, Ilia

Real Time Charging (RTC) applications that reside in the telecommunications domain have the need for extremely fast database transactions. Today´s providers rely mostly on in-memory databases for this kind of information processing. A flexible and modular benchmark suite specifically designed for this domain provides a valuable framework to test the performance of different DB candidates. Besides a data and a load generator, the suite also includes decoupled database connectors and use case components for convenient customization and extension. Such easily produced test results can be used as guidance for choosing a subset of candidates for further tuning/testing and finally evaluating the database most suited to the chosen use cases. This is why our benchmark suite can be of value for choosing databases for RTC use cases.

DBMS on modern storage hardware (2015)

Petrov, Ilia ; Gottstein, Robert ; Hardock, Sergej

In the present tutorial we perform a cross-cut analysis of database systems from the perspective of modern storage technology, namely Flash memory. We argue that neither the design of modern DBMS, nor the architecture of flash storage technologies are aligned with each other. The result is needlessly suboptimal DBMS performance and inefficient flash utilisation as well as low flash storage endurance and reliability. We showcase new DBMS approaches with improved algorithms and leaner architectures, designed to leverage the properties of modern storage technologies. We cover the area of transaction management and multi-versioning, putting a special emphasis on: (i) version organisation models and invalidation mechanisms in multi-versioning DBMS; (ii) Flash storage management especially on append-based storage in tuple granularity; (iii) Flash-friendly buffer management; as well as (iv) improvements in the searching and indexing models. Furthermore, we present our NoFTL approach to native Flash access that integrates parts of the flash-management functionality into the DBMS yielding significant performance increase and simplification of the I/O stack. In addition, we cover the basics of building large Flash storage for DBMS and revisit some of the RAID techniques and principles.

NoFTL for real : databases on real native Flash storage (2015)

Hardock, Sergej ; Petrov, Ilia ; Gottstein, Robert ; Buchmann, Alejandro

Flash SSDs are omnipresent as database storage. HDD replacement is seamless since Flash SSDs implement the same legacy hardware and software interfaces to enable backward compatibility. Yet, the price paid is high as backward compatibility masks the native behaviour, incurs significant complexity and decreases I/O performance, making it non-robust and unpredictable. Flash SSDs are black-boxes. Although DBMS have ample mechanisms to control hardware directly and utilize the performance potential of Flash memory, the legacy interfaces and black-box architecture of Flash devices prevent them from doing so. In this paper we demonstrate NoFTL, an approach that enables native Flash access and integrates parts of the Flashmanagement functionality into the DBMS yielding significant performance increase and simplification of the I/O stack. NoFTL is implemented on real hardware based on the OpenSSD research platform. The contributions of this paper include: (i) a description of the NoFTL native Flash storage architecture; (ii) its integration in Shore-MT and (iii) performance evaluation of NoFTL on a real Flash SSD and on an on-line data-driven Flash emulator under TPCB, C,E and H workloads. The performance evaluation results indicate an improvement of at least 2.4x on real hardware over conventional Flash storage; as well as better utilisation of native Flash parallelism.

Log data analysis through tick discretisation (2016)

Knödler, Christian ; Mößner, Bernhard ; Petrov, Ilia

Nowadays almost every major company has a monitoring system and produces log data to analyse their systems. To perform analysation on the log data and to extract experience for future decisions it is important to transform and synchronize different time series. For synchronizing multiple time series several methods are provided so that they are leading to a synchronized uniform time series. This is achieved by using discretisation and approximation methodics. Furthermore the discretisation through ticks is demonstrated, as well as the respectivly illustrated results.

Near data processing within column-oriented DBMSs for high performance analysis (2016)

Vinçon, Tobias ; Petrov, Ilia

Rapidly growing data volumes push today's analytical systems close to the feasible processing limit. Massive parallelism is one possible solution to reduce the computational time of analytical algorithms. However, data transfer becomes a significant bottleneck since it blocks system resources moving data-to-code. Technological advances allow to economically place compute units close to storage and perform data processing operations close to data, minimizing data transfers and increasing scalability. Hence the principle of Near Data Processing (NDP) and the shift towards code-to-data. In the present paper we claim that the development of NDP-system architectures becomes an inevitable task in the future. Analytical DBMS like HPE Vertica have multiple points of impact with major advantages which are presented within this paper.

Author(s)
Title
Additional person(s)
Publisher
Supervisor(s)
Abstract
Full text

Open Access

Refine

Author

Year of publication

Document Type

Language

Has full text

Is part of the Bibliography

Institute

Publisher

48 search hits