Ja
Refine
Document Type
- Conference proceeding (12)
- Journal article (8)
Language
- English (20)
Has full text
- yes (20)
Is part of the Bibliography
- yes (20)
Institute
- Informatik (20) (remove)
Publisher
- IARIA (15)
- Gesellschaft für Informatik e.V (2)
- Ed2.0Work (1)
Data Integration of heterogeneous data sources relies either on periodically transferring large amounts of data to a physical Data Warehouse or retrieving data from the sources on request only. The latter results in the creation of what is referred to as a virtual Data Warehouse, which is preferable when the use of the latest data is paramount. However, the downside is that it adds network traffic and suffers from performance degradation when the amount of data is high. In this paper, we propose the use of a readCheck validator to ensure the timeliness of the queried data and reduced data traffic. It is further shown that the readCheck allows transactions to update data in the data sources obeying full Atomicity, Consistency, Isolation, and Durability (ACID) properties.
At DBKDA 2019, we demonstrated that StrongDBMS with simple but rigorous optimistic algorithms, provides better performance in situations of high concurrency than major commercial database management systems (DBMS). The demonstration was convincing but the reasons for its success were not fully analysed. There is a brief account of the results below. In this short contribution, we wish to discuss the reasons for the results. The analysis leads to a strong criticism of all DBMS algorithms based on locking, and based on these results, it is not fanciful to suggest that it is time to re-engineer existing DBMS.
This paper reviews suggestions for changes to database technology coming from the work of many researchers, particularly those working with evolving big data. We discuss new approaches to remote data access and standards that better provide for durability and auditability in settings including business and scientific computing. We propose ways in which the language standards could evolve, with proof-of-concept implementations on Github.
Recent work on database application development platforms has sought to include a declarative formulation of a conceptual data model in the application code, using annotations or attributes. Some recent work has used metadata to include the details of such formulations in the physical database, and this approach brings significant advantages in that the model can be enforced across a range of applications for a single database. In previous work, we have discussed the advantages for enterprise integration of typed graph data models (TGM), which can play a similar role in graphical databases, leveraging the existing support for the unified modelling language UML. Ideally, the integration of systems designed with different models, for example, graphical and relational database, should also be supported. In this work, we implement this approach, using metadata in a relational database management system (DBMS).
The typed graph model
(2020)
In recent years, the Graph Model has become increasingly popular, especially in the application domain of social networks. The model has been semantically augmented with properties and labels attached to the graph elements. It is difficult to ensure data quality for the properties and the data structure because the model does not need a schema. In this paper, we propose a schema bound Typed Graph Model with properties and labels. These enhancements improve not only data quality but also the quality of graph analysis. The power of this model is provided by using hyper-nodes and hyper edges, which allows to present a data structure on different abstraction levels. We demonstrate by example the superiority of this model over the property graph data model of Hidders and other prevalent data models, namely the relational, object-oriented, and XML model.
In recent years, the Graph Model has become increasingly popular, especially in the application domain of social networks. The model has been semantically augmented with properties and labels attached to the graph elements. It is difficult to ensure data quality for the properties and the data structure because the model does not need a schema. In this paper, we propose a schema bound Typed Graph Model with properties and labels. These enhancements improve not only data quality but also the quality of graph analysis. The power of this model is provided by using hyper-nodes and hyper-edges, which allows to present data structures on different abstraction levels. We prove that the model is at least equivalent in expressive power to most popular data models. Therefore, it can be used as a supermodel for model management and data integration. We illustrate by example the superiority of this model over the property graph data model of Hidders and other prevalent data models, namely the relational, object-oriented, XML model, and RDF Schema.
"Learning by doing" in Higher Education in technical disciplines is mostly realized by hands-on labs. It challenges the exploratory aptitude and curiosity of a person. But, exploratory learning is hindered by technical situations that are not easy to establish and to verify. Technical skills are, however, mandatory for employees in this area. On the other side, theoretical concepts are often compromised by commercial products. The challenge is to contrast and reconcile theory with practice. Another challenge is to implement a self-assessment and grading scheme that keeps up with the scalability of e-learning courses. In addition, it should allow the use of different commercial products in the labs and still grade the assignment results automatically in a uniform way. In two European Union funded projects we designed, implemented, and evaluated a unique e-learning reference model, which realizes a modularized teaching concept that provides easily reproducible virtual hands-on labs. The novelty of the approach is to use software products of industrial relevance to compare with theory and to contrast different implementations. In a sample case study, we demonstrate the automated assessment for the creative database modeling and design task. Pilot applications in several European countries demonstrated that the participants gained highly sustainable competences that improved their attractiveness for employment.
Schema and data integration have been a challenge for more than 40 years. While data warehouse technologies are quite a success story, there is still a lack of information integration methods, especially if the data sources are based on different data models or do not have a schema. Enterprise Information Integration has to deal with heterogeneous data sources and requires up-to-date high-quality information to provide a reliable basis for analysis and decision-making. The paper proposes virtual integration using the Typed Graph Model to support schema mediation. The integration process first converts the structure of each source into a typed graph schema, which is then matched to the mediated schema. Mapping rules define transformations between the schemata to reconcile semantics. The mapping can be visually validated by experts. It provides indicators and rules to achieve a consistent schema mapping, which leads to high data integrity and quality.
This paper presents a concurrency control mechanism that does not follow a ‘one concurrency control mechanism fits all needs’ strategy. With the presented mechanism a transaction runs under several concurrency control mechanisms and the appropriate one is chosen based on the accessed data. For this purpose, the data is divided into four classes based on its access type and usage (semantics). Class O (the optimistic class) implements a first-committer-wins strategy, class R (the reconciliation class) implements a first-n-committers-win strategy, class P (the pessimistic class) implements a first reader-wins strategy, and class E (the escrow class) implements a firsnreaderswin strategy. Accordingly, the model is called OjRjPjE. Under this model the TPC-C benchmark outperforms other CC mechanisms like optimistic Snapshot Isolation.
This paper presents a concurrency control mechanism that does not follow a "one concurrency control mechanism fits all needs" strategy. With the presented mechanism a transaction runs under several concurrency control mechanisms and the appropriate one is chosen based on the accessed data. For this purpose, the data is divided into four classes based on its access type and usage (semantics). Class O (the optimistic class) implements a first-committer-wins strategy, class R (the reconciliation class) implements a first-n-committers-win strategy, class P (the pessimistic class) implements a first-reader-wins strategy, and class E (the escrow class) implements a first-n-readers-win strategy. Accordingly, the model is called OjRjPjE. The selected concurrency control mechanism may be automatically adapted at run-time according to the current load or a known usage profile. This run-time adaptation allows OjRjPjE to balance the commit rate and the response time even under changing conditions. OjRjPjE outperforms the Snapshot Isolation concurrency control in terms of response time by a factor of approximately 4.5 under heavy transactional load (4000 concurrent transactions). As consequence, the degree of concurrency is 3.2 times higher.