Informatik
Refine
Document Type
- Conference proceeding (8) (remove)
Language
- English (8)
Has full text
- yes (8)
Is part of the Bibliography
- yes (8)
Institute
- Informatik (8)
Publisher
- IARIA (8) (remove)
Recent work on database application development platforms has sought to include a declarative formulation of a conceptual data model in the application code, using annotations or attributes. Some recent work has used metadata to include the details of such formulations in the physical database, and this approach brings significant advantages in that the model can be enforced across a range of applications for a single database. In previous work, we have discussed the advantages for enterprise integration of typed graph data models (TGM), which can play a similar role in graphical databases, leveraging the existing support for the unified modelling language UML. Ideally, the integration of systems designed with different models, for example, graphical and relational database, should also be supported. In this work, we implement this approach, using metadata in a relational database management system (DBMS).
Schema and data integration have been a challenge for more than 40 years. While data warehouse technologies are quite a success story, there is still a lack of information integration methods, especially if the data sources are based on different data models or do not have a schema. Enterprise Information Integration has to deal with heterogeneous data sources and requires up-to-date high-quality information to provide a reliable basis for analysis and decision-making. The paper proposes virtual integration using the Typed Graph Model to support schema mediation. The integration process first converts the structure of each source into a typed graph schema, which is then matched to the mediated schema. Mapping rules define transformations between the schemata to reconcile semantics. The mapping can be visually validated by experts. It provides indicators and rules to achieve a consistent schema mapping, which leads to high data integrity and quality.
At DBKDA 2019, we demonstrated that StrongDBMS with simple but rigorous optimistic algorithms, provides better performance in situations of high concurrency than major commercial database management systems (DBMS). The demonstration was convincing but the reasons for its success were not fully analysed. There is a brief account of the results below. In this short contribution, we wish to discuss the reasons for the results. The analysis leads to a strong criticism of all DBMS algorithms based on locking, and based on these results, it is not fanciful to suggest that it is time to re-engineer existing DBMS.
The typed graph model
(2020)
In recent years, the Graph Model has become increasingly popular, especially in the application domain of social networks. The model has been semantically augmented with properties and labels attached to the graph elements. It is difficult to ensure data quality for the properties and the data structure because the model does not need a schema. In this paper, we propose a schema bound Typed Graph Model with properties and labels. These enhancements improve not only data quality but also the quality of graph analysis. The power of this model is provided by using hyper-nodes and hyper edges, which allows to present a data structure on different abstraction levels. We demonstrate by example the superiority of this model over the property graph data model of Hidders and other prevalent data models, namely the relational, object-oriented, and XML model.
This work presents a disconnected transaction model able to cope with the increased complexity of longliving, hierarchically structured, and disconnected transactions. Wecombine an Open and Closed Nested Transaction Model with Optimistic Concurrency Control and interrelate flat transactions with the aforementioned complex nature. Despite temporary inconsistencies during a transaction’s execution our model ensures consistency.
This paper presents a concurrency control mechanism that does not follow a ‘one concurrency control mechanism fits all needs’ strategy. With the presented mechanism a transaction runs under several concurrency control mechanisms and the appropriate one is chosen based on the accessed data. For this purpose, the data is divided into four classes based on its access type and usage (semantics). Class O (the optimistic class) implements a first-committer-wins strategy, class R (the reconciliation class) implements a first-n-committers-win strategy, class P (the pessimistic class) implements a first reader-wins strategy, and class E (the escrow class) implements a firsnreaderswin strategy. Accordingly, the model is called OjRjPjE. Under this model the TPC-C benchmark outperforms other CC mechanisms like optimistic Snapshot Isolation.
When forecasting sales figures, not only the sales history but also the future price of a product will influence the sales quantity. At first sight, multivariate time series seem to be the appropriate model for this task. Nontheless, in real life history is not always repeatable, i.e. in the case of sales history there is only one price for a product at a given time. This complicates the design of a multivariate time series. However, for some seasonal or perishable products the price is rather a function of the expiration date than of the sales history. This additional information can help to design a more accurate and causal time series model. The proposed solution uses an univariate time series model but takes the price of a product as a parameter that influences systematically the prediction. The price influence is computed based on historical sales data using correlation analysis and adjustable price ranges to identify products with comparable history. Compared to other techniques this novel approach is easy to compute and allows to preset the price parameter for predictions and simulations. Tests with data from the Data Mining Cup 2012 demonstrate better results than established sophisticated time series methods.
The recent years and especially the Internet have changed the way on how data is stored. We now often store data together with its creation time-stamp. These data sequences potentially enable us to track the change of data over time. This is quite interesting, especially in the e-commerce area, in which classification of a sequence of customer actions, is still a challenging task for data miners. However, before Standard algorithms such as Decision Trees, Neuronal Nets, Naive Bayes or Bayesian Belief Networks can be applied on sequential data, preparations need to be done in order to capture the information stored within the sequences. Therefore, this work presents a systematic approach on how to reveal sequence patterns among data and how to construct powerful features out of the primitive sequence attributes. This is achieved by sequence aggregation and the incorporation of time dimension into the Feature construction step. The proposed algorithm is described in detail and applied on a real life data set, which demonstrates the ability of the proposed algorithm to boost the classification performance of well known data mining algorithms for classification tasks.