Informatik
The tale of 1000 cores: an evaluation of concurrency control on real(ly) large multi-socket hardware
(2020)
In this paper, we set out to revisit the results of “Staring into the Abyss [...] of Concurrency Control with [1000] Cores” and analyse in-memory DBMSs on today’s large hardware. Contrary to the original authors’ assumption, we do not see single-socket CPUs with 1000 cores today. Instead, multi-socket hardware has made its way into production data centres. Hence, we follow up on this prior work with an evaluation of the characteristics of concurrency control schemes on real production multi-socket hardware with 1568 cores. To our surprise, we made several interesting findings, which we report on in this paper.
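The scale matters here: even a tiny serialized component dominates at 1568 cores. The following back-of-the-envelope sketch (Amdahl’s law with invented serial fractions, not numbers from the paper) illustrates why centralized concurrency control components become the bottleneck on such machines:

```python
# Amdahl-style upper bound on speedup when a fraction `serial_fraction`
# of each transaction runs inside a serialized component (e.g. a global
# lock or a shared timestamp counter). All numbers are illustrative.
def max_speedup(cores: int, serial_fraction: float) -> float:
    """Best possible speedup over a single core."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

for s in (0.01, 0.001):  # 1% vs 0.1% of each transaction serialized
    print(f"serial fraction {s:.3f}: "
          f"64 cores -> {max_speedup(64, s):6.1f}x, "
          f"1568 cores -> {max_speedup(1568, s):6.1f}x")
# Even 1% serialization caps a 1568-core machine below ~100x speedup,
# which is why centralized scheme components dominate at this scale.
```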
In this paper, we present a new approach for achieving robust performance of data structures, making it easier to reuse the same design not only across hardware generations but also across workloads. The main idea is to strictly separate the data structure design from the strategies used to execute access operations, and to adjust those execution strategies by means of so-called configurations instead of hard-wiring them into the data structure. In our evaluation we demonstrate the benefits of this configuration approach for individual data structures as well as for complex OLTP workloads.
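As a rough illustration of the separation being described, here is a minimal sketch (names and strategies are invented for illustration, not the paper’s API) in which the lookup structure stays fixed while its execution strategy is swapped via a configuration:

```python
import bisect

def linear_search(keys, k):               # cheap for very small key ranges
    for i, key in enumerate(keys):
        if key >= k:
            return i if key == k else -1
    return -1

def binary_search(keys, k):               # robust general-purpose default
    i = bisect.bisect_left(keys, k)
    return i if i < len(keys) and keys[i] == k else -1

def interpolation_search(keys, k):        # shines on uniformly distributed keys
    lo, hi = 0, len(keys) - 1
    while lo <= hi and keys[lo] <= k <= keys[hi]:
        if keys[hi] == keys[lo]:
            break
        mid = lo + (k - keys[lo]) * (hi - lo) // (keys[hi] - keys[lo])
        if keys[mid] == k:
            return mid
        if keys[mid] < k:
            lo = mid + 1
        else:
            hi = mid - 1
    return lo if 0 <= lo < len(keys) and keys[lo] == k else -1

class ConfigurableIndex:
    """Stores sorted keys; *how* lookups execute comes from the configuration."""
    def __init__(self, keys, config):
        self.keys = sorted(keys)
        self.config = config               # execution strategy lives outside

    def lookup(self, k):
        return self.config["search"](self.keys, k)

idx = ConfigurableIndex(range(0, 1000, 2), config={"search": binary_search})
print(idx.lookup(500))                       # -> position of key 500
idx.config["search"] = interpolation_search  # reconfigure without rebuilding
print(idx.lookup(501))                       # -> -1, key not present
```

The same structure can thus follow a hardware or workload change by replacing only the configuration, which is the reuse argument the abstract makes.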
The performance and scalability of modern data-intensive systems are limited by massive data movement of growing datasets across the whole memory hierarchy to the CPUs. Such traditional processor-centric DBMS architectures are bandwidth- and latency-bound. Processing-in-Memory (PIM) designs seek to overcome these limitations by integrating memory and processing functionality on the same chip. PIM targets near- or in-memory data processing, leveraging the greater in-situ parallelism and bandwidth.
In this paper, we introduce pimDB and provide an initial comparison of processor-centric and PIM-DBMS approaches under different aspects, such as scalability and parallelism, cache-awareness, or PIM-specific compute/bandwidth tradeoffs. The evaluation is performed end-to-end on a real PIM hardware system from UPMEM.
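A hedged sketch of the underlying tradeoff (all constants invented, not measurements from pimDB or UPMEM hardware): a selection scan either ships the whole column to the CPU or is filtered in-situ so that only qualifying rows cross the bus:

```python
# All constants below are invented for illustration (not pimDB/UPMEM numbers).
def host_scan_time(rows, row_bytes, bus_gbps):
    """Processor-centric: every byte crosses the memory bus to the CPU."""
    return rows * row_bytes / (bus_gbps * 1e9)

def pim_scan_time(rows, row_bytes, selectivity, n_pim_units, unit_mops, bus_gbps):
    """PIM-style: filter inside the memory banks, transfer only matches."""
    compute = rows / n_pim_units / (unit_mops * 1e6)   # ~1 op per row, in parallel
    transfer = rows * selectivity * row_bytes / (bus_gbps * 1e9)
    return compute + transfer

rows, row_bytes = 1_000_000_000, 8
host = host_scan_time(rows, row_bytes, bus_gbps=50)
for sel in (0.001, 0.1, 0.9):
    pim = pim_scan_time(rows, row_bytes, sel, n_pim_units=2048,
                        unit_mops=500, bus_gbps=50)
    print(f"selectivity {sel:5.3f}: host {host:.4f}s vs PIM {pim:.4f}s")
# Low selectivity strongly favors in-situ filtering; as selectivity grows,
# the transfer term returns and the PIM advantage shrinks.
```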
Real Time Charging (RTC) applications in the telecommunications domain require extremely fast database transactions. Today’s providers rely mostly on in-memory databases for this kind of information processing. A flexible and modular benchmark suite specifically designed for this domain provides a valuable framework to test the performance of different DB candidates. Besides a data generator and a load generator, the suite also includes decoupled database connectors and use case components for convenient customization and extension. The easily produced test results can guide the selection of a subset of candidates for further tuning and testing, and ultimately the choice of the database best suited to the given use cases. This makes our benchmark suite valuable for choosing databases for RTC use cases.
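To illustrate the decoupling, a minimal sketch follows (the interfaces and the SQLite stand-in are illustrative, not the suite’s actual components): use cases are written against an abstract connector, so swapping the DB candidate means swapping only the connector:

```python
from abc import ABC, abstractmethod
import sqlite3, time

class DBConnector(ABC):
    """Hypothetical connector interface a DB candidate must implement."""
    @abstractmethod
    def execute(self, op, *args): ...

class SQLiteConnector(DBConnector):       # stand-in for an in-memory DBMS
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE balance(subscriber TEXT PRIMARY KEY, cents INT)")

    def execute(self, op, *args):
        if op == "provision":
            self.conn.execute("INSERT INTO balance VALUES (?, ?)", args)
        elif op == "charge":
            self.conn.execute(
                "UPDATE balance SET cents = cents - ? WHERE subscriber = ?", args)

class ChargingUseCase:                    # one RTC use case component
    def __init__(self, connector):
        self.db = connector

    def run(self, n):
        self.db.execute("provision", "alice", 10_000)
        start = time.perf_counter()
        for _ in range(n):
            self.db.execute("charge", 1, "alice")   # one charging event
        return n / (time.perf_counter() - start)

print(f"{ChargingUseCase(SQLiteConnector()).run(50_000):,.0f} charges/s")
```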
Maintainability assurance techniques are used to control software maintainability and to limit the accumulation of potentially unknown technical debt. Since the industry state of practice, and especially the treatment of service- and microservice-based systems in this regard, is not well covered in the scientific literature, we created a survey to gather evidence on a) processes, tools, and metrics used in industry, b) maintainability-related treatment of systems based on service orientation, and c) influences on developer satisfaction w.r.t. maintainability. 60 software professionals responded to our online questionnaire. The results indicate that using explicit and systematic techniques benefits maintainability. The more sophisticated the applied methods, the more satisfied participants were with the maintainability of their software, while no link to reduced productivity could be established. Other important findings were the absence of architecture-level evolvability control mechanisms as well as a significant neglect of service-oriented particularities in quality assurance. The results suggest that the industry has to improve its quality control in these regards to avoid problems with long-living service-based software systems.
Towards a practical maintainability quality model for service- and microservice-based systems
(2017)
Although the current literature mentions many different metrics related to the maintainability of service-based systems (SBSs), there is no comprehensive quality model (QM) with automatic evaluation and a practical focus. To fill this gap, we propose a Maintainability Model for Services (MM4S), a layered maintainability QM consisting of service properties (SPs) associated with automatically collectable service metrics (SMs). This research artifact, created within an ongoing Design Science Research (DSR) project, is the first version ready for detailed evaluation and critical feedback. The goal of MM4S is to serve as a simple and practical tool for basic maintainability estimation and control in the context of SBSs and their specialization, microservice-based systems (μSBSs).
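A minimal sketch of the layered idea (the layer names follow the abstract, but the concrete metrics, thresholds, and mapping are invented for illustration): collectable metrics are normalized and rolled up into service properties.

```python
# Hypothetical metric-to-property mapping and worst-case thresholds;
# MM4S defines its own metrics, properties, and evaluation rules.
METRIC_TO_PROPERTY = {
    "loc_per_operation": "size",
    "cyclomatic":        "complexity",
    "outgoing_calls":    "coupling",
}
WORST = {"loc_per_operation": 200, "cyclomatic": 30, "outgoing_calls": 15}

def evaluate(measured):
    """measured: metric -> raw value; returns property -> score in [0, 1]."""
    scores = {}
    for metric, value in measured.items():
        prop = METRIC_TO_PROPERTY[metric]
        score = max(0.0, 1.0 - value / WORST[metric])  # 1 = best, 0 = worst
        scores.setdefault(prop, []).append(score)
    return {p: sum(v) / len(v) for p, v in scores.items()}

print(evaluate({"loc_per_operation": 50, "cyclomatic": 12, "outgoing_calls": 3}))
# -> {'size': 0.75, 'complexity': 0.6, 'coupling': 0.8}
```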
In a time of digital transformation, the ability to quickly and efficiently adapt software systems to changed business requirements becomes more important than ever. Measuring the maintainability of software is therefore crucial for the long-term management of such products. With service-based systems (SBSs) being a very important form of enterprise software, we present a holistic overview of maintainability metrics specifically designed for this type of system, since traditional metrics, e.g. object-oriented ones, are not fully applicable in this case. The metric candidates selected from the literature review were mapped to four dominant design properties: size, complexity, coupling, and cohesion. Microservice-based systems (μSBSs) emerge as an agile and fine-grained variant of SBSs. While the majority of identified metrics are also applicable to this specialization (with some limitations), the large number of services in combination with technological heterogeneity and decentralization of control significantly impacts automatic metric collection in such a system. Our research therefore suggests that specialized tool support is required to guarantee the practical applicability of the presented metrics to μSBSs.
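As an illustration of the kind of metrics involved, the sketch below derives simple size and coupling numbers from a service dependency graph (the AIS/ADS-style definitions are simplified, and the graph is invented):

```python
# Invented service dependency graph; "operations" stands in for a size metric.
services = {
    "orders":    {"calls": ["billing", "inventory"], "operations": 12},
    "billing":   {"calls": ["payments"],             "operations": 7},
    "inventory": {"calls": [],                       "operations": 9},
    "payments":  {"calls": [],                       "operations": 5},
}

def absolute_importance(name):   # how many services depend on `name`
    return sum(name in s["calls"] for s in services.values())

def absolute_dependence(name):   # how many services `name` depends on
    return len(services[name]["calls"])

for name in services:
    print(f"{name:9s} size={services[name]['operations']:2d} "
          f"AIS={absolute_importance(name)} ADS={absolute_dependence(name)}")
```

Collecting even such simple inputs automatically across dozens of heterogeneous, independently deployed services is exactly where the tooling gap described above appears.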
Painting galleries typically provide a wealth of data composed of several data types. Such multivariate data are too complex for laypeople like museum visitors, who first want an overview of all paintings, then look for specific categories, and finally wish to be guided to a specific painting they would like to examine more closely. In this paper we describe an interactive visualization tool that provides such an overview and lets people explore the more than 41,000 paintings collected in the Web Gallery of Art. The technique behind the tool is composed of several stages: data handling, algorithmic transformations, visualizations, and interactions, with the human user working with the tool to discover insights in the provided data. We illustrate the usefulness of the visualization tool by applying it to this characteristic dataset and show how one can move from an overview of all paintings to specific paintings.
In this paper we describe an interactive web-based tool for the visual analysis of Formula 1 data. A calendar-like representation provides an overview of all races on a yearly basis, in either absolute or normalized time. After selecting a particular race, more details about it can be explored. Furthermore, it is possible to compare up to three different races. Besides visualizing details of individual races, driver and team performance can also be analysed over time. A user study was conducted to gather feedback on the usage of the application and to decide between different visualization options.
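One plausible reading of the “normalized time” option is sketched below (durations are invented; the tool’s actual normalization may differ): expressing each race relative to the fastest race makes races of very different lengths comparable in one calendar view.

```python
# Invented race durations in seconds; not data from the described tool.
races = {"Monaco": 6284.4, "Monza": 4533.1, "Spa": 5320.9}

fastest = min(races.values())
normalized = {gp: t / fastest for gp, t in races.items()}
for gp, v in sorted(normalized.items(), key=lambda kv: kv[1]):
    print(f"{gp:6s} {v:.3f}x of the fastest race")
```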
An index in a Multi-Version DBMS (MV-DBMS) has to reflect different tuple versions of a single data item. Existing approaches follow the paradigm of logically separating the tuple version data from the data item; e.g., an index is only allowed to return at most one version of a single data item (while it may return multiple data items that match a search criterion). Hence, to determine the valid (and therefore visible) tuple version of a data item, the MV-DBMS first fetches all tuple versions that match the search criterion and subsequently filters visible versions using visibility checks. This incurs storage I/O for tuple versions that never needed to be fetched. In this vision paper we present the Multi-Version Index (MV-IDX) approach, which allows index-only visibility checks that significantly reduce the amount of storage I/O as well as the index maintenance overhead. The MV-IDX achieves significantly lower response times and higher transactional throughput on OLTP workloads.
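A simplified sketch of the index-only visibility idea (timestamp handling is reduced to validity intervals; this is not the MV-IDX implementation): each index entry carries its version’s validity interval, so the visible version is determined inside the index and only one tuple is fetched.

```python
INF = float("inf")

class MVIndex:
    """Toy multi-version index: key -> list of (row_id, begin_ts, end_ts)."""
    def __init__(self):
        self.entries = {}

    def insert(self, key, row_id, begin_ts):
        versions = self.entries.setdefault(key, [])
        if versions:                                  # close the previous version
            rid, b, _ = versions[-1]
            versions[-1] = (rid, b, begin_ts)
        versions.append((row_id, begin_ts, INF))

    def lookup(self, key, txn_ts):
        """Row id visible to a transaction reading at txn_ts, or None."""
        for row_id, begin, end in self.entries.get(key, []):
            if begin <= txn_ts < end:                 # visibility check in-index
                return row_id                         # no tuple fetch needed
        return None

idx = MVIndex()
idx.insert("item42", row_id=7, begin_ts=10)   # v1 valid in [10, 20)
idx.insert("item42", row_id=9, begin_ts=20)   # v2 valid in [20, inf)
print(idx.lookup("item42", txn_ts=15))        # -> 7 (older snapshot)
print(idx.lookup("item42", txn_ts=25))        # -> 9 (current version)
```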
Under update-intensive workloads (TPC, LinkBench), small updates dominate the write behavior; e.g., 70% of all updates change fewer than 10 bytes across all TPC OLTP workloads. These are typically performed as in-place updates and result in random writes at page granularity, causing major write overhead on Flash storage, a write amplification of several hundred times, and lower device longevity.
In this paper we propose an approach that transforms those small in-place updates into small update deltas that are appended to the original page. We utilize the commonly ignored fact that modern Flash memories (SLC, MLC, 3D NAND) can handle appends to already programmed physical pages by using various low-level techniques such as ISPP to avoid expensive erases and page migrations. Furthermore, we extend the traditional NSM page-layout with a delta-record area that can absorb those small updates. We propose a scheme to control the write behavior as well as the space allocation and sizing of database pages.
The proposed approach has been implemented under Shore-MT and evaluated on real Flash hardware (OpenSSD) and a Flash emulator. Compared to In-Page Logging it performs up to 62% fewer reads and writes and up to 74% fewer erases on a range of workloads. The experimental evaluation indicates: (i) a significant reduction of erase operations, resulting in twice the longevity of Flash devices under update-intensive workloads; (ii) 15%-60% lower read/write I/O latencies; (iii) up to 45% higher transactional throughput; (iv) a 2x to 3x reduction in overall write amplification.
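A much-simplified sketch of the delta-record idea (in-memory dictionaries stand in for the on-page record and delta areas; the real approach operates on NSM pages and Flash-level appends): small updates are appended as deltas instead of rewriting the record in place, and reads replay the deltas for their record.

```python
class DeltaPage:
    """Toy page: a record area plus an append-only delta area."""
    def __init__(self, capacity_deltas=16):
        self.records = {}          # slot_id -> dict of fields
        self.deltas = []           # append-only (slot_id, field, value)
        self.capacity = capacity_deltas

    def update(self, slot, field, value):
        if len(self.deltas) < self.capacity:
            self.deltas.append((slot, field, value))  # small append, no rewrite
            return True
        return False               # delta area full -> fall back to page rewrite

    def read(self, slot):
        rec = dict(self.records[slot])
        for s, f, v in self.deltas:                   # replay deltas in order
            if s == slot:
                rec[f] = v
        return rec

page = DeltaPage()
page.records[0] = {"balance": 100, "name": "alice"}
page.update(0, "balance", 90)      # a <10-byte change becomes a delta
page.update(0, "balance", 80)
print(page.read(0))                # {'balance': 80, 'name': 'alice'}
```

The sizing question the abstract mentions corresponds to `capacity_deltas` here: a larger delta area absorbs more small updates before a rewrite, at the cost of record space on the page.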
Selecting a suitable development method for a specific project context is one of the most challenging activities in process design. Every project is unique and, thus, many context factors have to be considered. Recent research has taken initial steps towards statistically constructing hybrid development methods, yet has paid little attention to the peculiarities of context factors influencing method and practice selection. In this paper, we utilize exploratory factor analysis and logistic regression analysis to learn such context factors and to identify methods that are correlated with these factors. Our analysis is based on 829 data points from the HELENA dataset. We provide five base clusters of methods, consisting of up to 10 methods each, that lay the foundation for devising hybrid development methods. The analysis of the five clusters using trained models reveals only a few context factors, e.g., project/product size and target application domain, that seem to significantly influence the selection of methods. An extended descriptive analysis of the practices in the context of the identified method clusters also suggests a consolidation of the relevant practice sets used in specific project contexts.
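For readers unfamiliar with the method, the sketch below shows the regression step on synthetic data (this is not the HELENA data or the paper’s model; it merely mirrors how a context factor can be related to method adoption):

```python
# Synthetic example only: relate one context factor (project size) to
# whether a team adopts a given method, via logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
size = rng.integers(1, 500, size=200).reshape(-1, 1)   # people per project
# Invented ground truth: larger projects adopt the method more often.
p_adopt = 1 / (1 + np.exp(-(size[:, 0] - 100) / 40))
uses_method = (rng.random(200) < p_adopt).astype(int)

model = LogisticRegression().fit(size, uses_method)
print("coefficient:", float(model.coef_[0][0]),
      "odds ratio per person:", float(np.exp(model.coef_[0][0])))
print("P(method |  30 people):", model.predict_proba([[30]])[0, 1])
print("P(method | 300 people):", model.predict_proba([[300]])[0, 1])
```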
Many modern DBMS architectures require transferring data from storage to process it afterwards. Given the continuously increasing amounts of data, data transfers quickly become a scalability limiting factor. Near-Data Processing and smart/computational storage emerge as promising trends allowing for decoupled in-situ operation execution, data transfer reduction and better bandwidth utilization. However, not every operation is suitable for an in-situ execution and a careful placement and optimization is needed.
In this paper we present an NDP-aware cost model. It has been implemented in MySQL and evaluated with nKV. We make several observations that underscore the need for such optimization.
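A toy version of such a placement decision is sketched below (all constants are invented; the actual cost model is integrated with the MySQL optimizer and evaluated with nKV): estimate the cost of running a scan on the host versus in-situ on the device, and place it where the estimate is lower.

```python
# Invented bandwidth/compute constants; illustrative placement logic only.
def host_cost(n_bytes, link_gbps=3.0, host_gbps=20.0):
    """Move everything over the device link, then process at host speed."""
    return n_bytes / (link_gbps * 1e9) + n_bytes / (host_gbps * 1e9)

def ndp_cost(n_bytes, selectivity, device_gbps=4.0, link_gbps=3.0):
    """Process on slower device cores, but transfer only the result."""
    on_device = n_bytes / (device_gbps * 1e9)
    result_tx = n_bytes * selectivity / (link_gbps * 1e9)
    return on_device + result_tx

def place(n_bytes, selectivity):
    return "in-situ" if ndp_cost(n_bytes, selectivity) < host_cost(n_bytes) else "host"

for sel in (0.01, 0.5, 0.95):
    print(f"selectivity {sel:4.2f}: run scan {place(10e9, sel)}")
# Selective scans are pushed down; unselective ones stay on the host,
# matching the observation that not every operation suits in-situ execution.
```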
First International Workshop on Hybrid dEveLopmENt Approaches in Software Systems Development
(2017)
A software process is the game plan for organizing project teams and running projects. Yet it is still a challenge to select the appropriate development approach for the respective context. A multitude of development approaches compete for the users’ favor, but there is no silver bullet serving all possible setups. Moreover, recent research as well as experience from practice show companies utilizing different development approaches to assemble the best-fitting approach for the respective company: a more traditional process provides the basic framework to serve the organization, while project teams embody this framework with more agile (and/or lean) practices to keep their flexibility. The first HELENA workshop aims to bring together the community to discuss recent findings and to steer future work.
In organizations that develop software in regulated domains, comprehensive software process models are implemented at large scale, e.g., to satisfy compliance requirements. Creating and evolving such processes is demanding and requires software engineers with substantial modeling skills to create consistent and certifiable processes. While teaching process engineering to students, we observed issues in providing and explaining models. In this paper, we present an exploratory study in which we aim to shed light on the challenges students face when it comes to modeling. Our findings show that students are capable of basic modeling tasks, yet fail to utilize models correctly. We conclude that the required skills, notably abstraction and solution development, are underdeveloped due to missing practice and routine. Since modeling is key to many software engineering disciplines, we advocate intensifying modeling activities in teaching.
Software process improvement (SPI) has been around for decades: frameworks have been proposed, success factors studied, and experiences reported. However, the sheer mass of concepts, approaches, and standards published over the years overwhelms practitioners as well as researchers. What is out there? Are there new emerging approaches? What are the open issues? Still, we struggle to answer the question: what is the current state of SPI and related research? In this paper, we present initial results from a systematic mapping study to shed light on the field of SPI and to draw conclusions for future research directions. An analysis of 635 publications draws a big picture of SPI-related research of the past 25 years. Our study shows a high number of solution proposals, experience reports, and secondary studies, but only a few theories. In particular, standard SPI models like CMMI and ISO/IEC 15504 are analyzed, enhanced, and evaluated for applicability, whereas these standards are critically discussed from the perspective of SPI in small-to-medium-sized companies, which leads to new specialized frameworks. Furthermore, we find a growing interest in success factors to aid companies in conducting SPI.
Software development consists to a large extent of human-based processes, with continuously increasing demands regarding interdisciplinary team work. Understanding the dynamics of software teams is therefore highly important to successful project execution, and for future project managers, knowledge about non-technical processes in teams is significant. In this paper, we present a course unit that provides an environment in which students can learn and experience the impact of group dynamics on project performance and quality. The course unit uses the Tuckman model as its theoretical framework and borrows from controlled experiments to organize and implement its practical parts, in which students experience the effects of, e.g., time pressure, resource bottlenecks, staff turnover, loss of key personnel, and other stress factors. We provide a detailed design of the course unit to allow for implementation in further software project management courses. Furthermore, we report experiences obtained from two instances of this unit conducted in Munich and Karlskrona with 36 graduate students. We observed students building awareness of stress factors and developing countermeasures to reduce the impact of those factors. Moreover, students experienced what problems occur when teams work under stress and how to form a performing team despite exceptional situations.
Using measurement and simulation for understanding distributed development processes in the Cloud
(2017)
Organizations increasingly develop software in a distributed manner. The Cloud provides an environment to create and maintain software-based products and services. Currently, it is largely unknown which software processes are suited for Cloud-based development and what their effects in specific contexts are. This paper presents a process simulation to study distributed development in the Cloud. We contribute a simulation model that helps analyze different project parameters and their impact on projects carried out in the Cloud. The simulator reproduces activities, developers, issues, and events in the project, and it generates statistics, e.g., on throughput, total time, and lead and cycle time. The aim of this simulation model is thus to analyze the tradeoffs regarding throughput, total time, project size, and team size. Furthermore, the modified simulation model aims to help project managers select the most suitable planning alternative. Based on observed projects in Finland and Spain, we simulated a distributed project using artificial and real data. In particular, we studied the variables project size, team size, throughput, and total project duration. A comparison of the real project data with the simulation results shows the simulation producing results close to the real data; we could thus successfully replicate a distributed software project. By improving the understanding of distributed development processes, our simulation model supports project managers in their decision-making.
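A minimal sketch in the spirit of such a simulator follows (arrival and effort distributions are invented): issues arrive, the next free developer picks them up, and throughput and lead time fall out of the event log.

```python
# Toy process simulation with invented parameters, not the paper's model.
import random, statistics

def simulate(team_size, n_issues=400, seed=1):
    random.seed(seed)
    dev_free_at = [0.0] * team_size          # next time each developer is free
    lead_times = []
    t_arrival = 0.0
    for _ in range(n_issues):
        t_arrival += random.expovariate(2.0)  # ~2 issues arrive per hour
        d = min(range(team_size), key=lambda i: dev_free_at[i])
        start = max(t_arrival, dev_free_at[d])
        effort = random.expovariate(1 / 3.0)  # mean 3 hours of work per issue
        dev_free_at[d] = start + effort
        lead_times.append(dev_free_at[d] - t_arrival)
    total = max(dev_free_at)
    return n_issues / total, statistics.mean(lead_times)

for team in (2, 5, 10):
    thr, lead = simulate(team)
    print(f"{team:2d} devs: throughput {thr:.2f} issues/h, mean lead {lead:.1f} h")
```

Even this toy version exposes the tradeoff the abstract studies: an undersized team caps throughput and inflates lead time, while an oversized one leaves throughput bounded by the arrival rate.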
This is a report from the fourth international one-day workshop on "Information Systems in Distributed Environments" (ISDE), organized in conjunction with the OnTheMove Federated Conferences & Workshops (OTM 2014), October 29-30, 2014, in Amantea, Calabria, Italy. The main focus of this event was to provide a venue for discussing challenges related to the development, operation, and maintenance of distributed information systems, and to their creation in the context of global development projects. Further dissemination of research results will lead to an improvement of distributed information system development and deployment across the globe.
Owing to increasing market dynamics, rapidly evolving technologies, and shifting user expectations, coupled with the adoption of lean and agile practices, companies struggle to provide reliable product roadmaps using traditional approaches. Currently, most companies are seeking opportunities to improve their product roadmapping practices. As a first challenge, they have to assess their current product roadmapping capabilities in order to better understand how to improve their practices and how to switch to a new approach. The aim of this article is to provide an initial maturity model for product roadmapping practices that is especially suited for assessing the roadmapping capabilities of companies operating in dynamic and uncertain market environments. Based on interviews with 15 experts from 13 companies, the current state of practice regarding product roadmapping was identified. Afterwards, the model development was conducted in the context of expert workshops with Robert Bosch GmbH and researchers. The study results in the so-called DEEP 1.0 product roadmap maturity model, which allows companies to conduct a self-assessment of their product roadmapping practice.
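To make the self-assessment idea concrete, here is a hypothetical sketch (the dimensions and stages below are invented; DEEP 1.0 defines its own): each roadmapping dimension is rated against ordered stages, yielding a maturity profile.

```python
# Invented dimensions and stage orderings, for illustration only.
DIMENSIONS = {
    "roadmap items":    ["features", "outcomes", "themes"],
    "planning horizon": ["fixed dates", "time buckets", "now/next/later"],
    "validation":       ["none", "expert opinion", "experiments"],
}

def assess(ratings):
    """ratings: dimension -> chosen stage; returns dimension -> maturity level."""
    return {dim: stages.index(ratings[dim]) + 1
            for dim, stages in DIMENSIONS.items()}

profile = assess({"roadmap items": "outcomes",
                  "planning horizon": "time buckets",
                  "validation": "none"})
print(profile)                               # per-dimension maturity levels
print("focus next:", min(profile, key=profile.get))  # weakest dimension
```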