Doctoral Theses (3) — Institute: Informatik; Language: English; no full text available; included in the Bibliography.
Knowledge is an important resource whose transfer is still not completely understood. The underlying belief of this thesis is that knowledge cannot be transferred directly from one person to another but must be converted for the transfer, and is therefore subject to loss and misunderstanding. This thesis proposes a new model for knowledge transfer and evaluates it empirically. The model is based on the belief that knowledge must be encoded by the sender in order to be transferred to the receiver, who has to decode the message to obtain the knowledge.
To prepare the ground for the model, this thesis provides an overview of existing models for knowledge transfer and of the factors that influence it. The proposed theoretical model for knowledge transfer is implemented in a prototype to demonstrate its applicability. The model describes the influence of four layers, namely the code, syntactic, semantic, and pragmatic layers, on the encoding and decoding of the message. The precise description of the influencing factors and of the overlapping knowledge of sender and receiver facilitates its implementation.
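As a rough illustration only, the following sketch (not taken from the thesis; all class and function names are assumptions) shows in Python how a message could be passed through code, syntactic, semantic, and pragmatic layers on the sender side and decoded in reverse order on the receiver side:

# Illustrative sketch of a four-layer encode/decode pipeline for knowledge
# transfer; the layer names follow the abstract, everything else is assumed.

class Layer:
    """One layer of the model; subclasses would refine encode/decode."""
    def encode(self, message):
        return message  # sender side: add layer-specific structure

    def decode(self, message):
        return message  # receiver side: strip/interpret that structure


class CodeLayer(Layer): pass        # symbols/characters carrying the message
class SyntacticLayer(Layer): pass   # grammar/notation, e.g. a BPM language
class SemanticLayer(Layer): pass    # meaning attached to the notation
class PragmaticLayer(Layer): pass   # intended use in the receiver's context


def transfer(knowledge, sender_layers, receiver_layers):
    """Encode through the sender's layers, then decode through the receiver's.

    Loss or misunderstanding can occur wherever the two layer stacks
    (the overlapping knowledge of sender and receiver) differ.
    """
    message = knowledge
    for layer in sender_layers:              # pragmatic -> ... -> code
        message = layer.encode(message)
    for layer in reversed(receiver_layers):  # code -> ... -> pragmatic
        message = layer.decode(message)
    return message


if __name__ == "__main__":
    layers = [PragmaticLayer(), SemanticLayer(), SyntacticLayer(), CodeLayer()]
    print(transfer("how to handle a customer order", layers, layers))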
The application area of the layered model for knowledge transfer was chosen to be business process modelling. Business processes constitute an important knowledge resource of an organisation, as they describe the procedures for the production of products and services. The implementation in a software prototype allows a precise description of the process by adding semantics to the simple business process modelling language used.
This thesis contributes to the body of knowledge by providing a new model for knowledge transfer, which shows the process of knowledge transfer in greater detail and highlights influencing factors. The implementation in the area of business process modelling reveals the support provided by the model. An expert evaluation indicates that the implementation of the proposed model supports knowledge transfer in business process modelling. The results of this qualitative evaluation are supported by the findings of a quantitative evaluation, performed as a quasi-experiment with a pre-test/post-test design, two experimental groups, and one control group. Mann-Whitney U tests indicated that the group that used the tool implementing the layered model performed significantly better in terms of completeness (the degree of completeness achieved in the transfer) than the group that used a standard BPM tool (Z = 3.057, p = 0.002, r = 0.59) and the control group that used pen and paper (Z = 3.859, p < 0.001, r = 0.72). The experiment indicates that the implementation of the layered model supports the creation of a business process and facilitates a more precise representation.
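For readers unfamiliar with the reported statistics, the following is a minimal sketch, not from the thesis, of how a Mann-Whitney U comparison and the effect size r = Z / sqrt(N) could be computed with SciPy; the completeness scores are invented example data:

# Hedged sketch: Mann-Whitney U test between two groups and effect size
# r = Z / sqrt(N). The completeness scores below are invented example data.
import math
from scipy.stats import mannwhitneyu, norm

layered_tool = [0.90, 0.85, 0.95, 0.80, 0.88, 0.92, 0.87, 0.91]  # hypothetical
standard_bpm = [0.70, 0.65, 0.75, 0.60, 0.68, 0.72, 0.66, 0.69]  # hypothetical

u_stat, p_value = mannwhitneyu(layered_tool, standard_bpm, alternative="two-sided")

# Derive an approximate Z score from the two-sided p-value and compute the
# effect size r = Z / sqrt(N), the measure reported in the abstract (e.g. r = 0.59).
n_total = len(layered_tool) + len(standard_bpm)
z_score = norm.isf(p_value / 2)
effect_size_r = z_score / math.sqrt(n_total)

print(f"U = {u_stat}, p = {p_value:.3f}, Z = {z_score:.3f}, r = {effect_size_r:.2f}")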
This thesis studies concurrency control and composition of transactions in computing environments with long-lived transactions where local data autonomy of transactions is indispensable. This kind of computing architecture is referred to as a Disconnected System, in which reads are segregated (disconnected) from writes, enabling local data autonomy. Disconnecting reads from writes is inspired by Bertrand Meyer's "Command Query Separation" pattern. This thesis provides a simple yet precise definition of a Disconnected System with a focus on transaction management. Concerning concurrency control, transaction management frameworks implement a 'one concurrency control mechanism fits all needs' strategy. This strategy, however, does not consider specific characteristics of data access. The thesis shows the limitations of this strategy when transaction load increases, transactions are long-lived, local data autonomy is required, and serializability is the targeted isolation level. For example, in optimistic mechanisms the number of aborts rises sharply as load increases; in pessimistic mechanisms locking causes long blocking times and is prone to deadlocks. These findings are not new, and a common remedy used by database vendors is to reduce the isolation level. This thesis proposes a novel approach: choosing the concurrency control mechanism according to the semantics of data access for each data item. As a result, a transaction may execute under several concurrency control mechanisms. The idea is to introduce lanes, similar to a motorway, where each lane is dedicated to a class of vehicles with the same characteristics. Whereas disconnecting reads and writes sets the traffic's direction, the semantics of data access defines the lanes. This thesis introduces four concurrency control classes that capture the semantics of data access, each with an associated, tailored concurrency control mechanism. Class O (the optimistic class) implements a first-committer-wins strategy, class R (the reconciliation class) implements a first-n-committers-win strategy, class P (the pessimistic class) implements a first-reader-wins strategy, and class E (the escrow class) implements a first-n-readers-win strategy. In contrast to solutions that adapt the concurrency control mechanism at runtime, the idea is to classify data during the design phase of the application and to adapt the classification only in certain cases at runtime. The result of the thesis is a transaction management framework called O|R|P|E. A performance study based on the TPC-C benchmark shows that O|R|P|E achieves better performance and a considerably higher commit rate than other solutions. Moreover, the thesis shows that in O|R|P|E aborts are due to application-specific limitations, i.e., constraint violations, and not to serialization conflicts. This is a result of considering the semantics of data access.
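To illustrate the lane idea, the following is a minimal, hypothetical sketch (not the actual O|R|P|E implementation) of how data items could be tagged at design time with one of the four classes and dispatched to a per-class concurrency control strategy; the example schema, enum, and function names are assumptions:

# Illustrative sketch: data items are classified at design time into one of the
# four concurrency control classes named in the abstract; each class maps to a
# tailored mechanism. All names and classifications here are simplified assumptions.
from enum import Enum

class CCClass(Enum):
    O = "optimistic"      # first-committer-wins
    R = "reconciliation"  # first-n-committers-win
    P = "pessimistic"     # first-reader-wins (locking)
    E = "escrow"          # first-n-readers-win (quantity reservation)

# Design-time classification of data items (example schema, not from the thesis).
DATA_CLASSIFICATION = {
    "customer.address":  CCClass.O,  # rarely conflicting updates
    "customer.comments": CCClass.R,  # concurrent appends can be reconciled
    "order.status":      CCClass.P,  # conflicting writes must be serialized
    "item.stock_level":  CCClass.E,  # decrements bounded by available quantity
}

def concurrency_control_for(data_item: str) -> CCClass:
    """Pick the lane (concurrency control mechanism) for a data item."""
    return DATA_CLASSIFICATION[data_item]

# A single transaction touching several items may therefore run under several
# mechanisms at once, one per accessed item.
for item in ("customer.address", "item.stock_level"):
    print(item, "->", concurrency_control_for(item).value)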
Data collected from internet applications are mainly stored in the form of transactions. All transactions of one user form a sequence, which shows the user's behaviour on the site. Nowadays, it is important to be able to classify this behaviour in real time for various reasons, e.g. to increase the conversion rate of customers while they are still in the store or to prevent fraudulent transactions before they are placed. However, this is difficult due to the complex structure of the data sequences (i.e. a mix of categorical and continuous data types, constant data updates) and the large amounts of data that are stored. Therefore, this thesis studies the classification of complex data sequences. It surveys the fields of time series analysis (temporal data mining), sequence data mining, and standard classification algorithms. It turns out that these algorithms are either difficult to apply to data sequences or do not deliver a classification: time series methods need a predefined model and are not able to handle complex data types; sequence classification algorithms such as the Apriori algorithm family are not able to utilize the time aspect of the data. The strengths and weaknesses of the candidate algorithms are identified and used to build a new approach to the classification of complex data sequences. The problem is solved by a two-step process. First, feature construction is used to create and discover suitable features in a training phase. Then, the blueprints of the discovered features are used in a formula during the classification phase to perform the real-time classification. The features are constructed by combining and aggregating the original data over the span of the sequence, including the elapsed time via a calculated time axis. Additionally, a combination of features and feature selection are used to simplify complex data types. This allows behavioural patterns that emerge over time to be captured. The proposed approach combines techniques from several research fields. Part of the algorithm originates from the field of feature construction and is used to reveal behaviour over time and to express this behaviour in the form of features. A combination of the features is used to highlight relations between them. The blueprints of these features can then be used to achieve classification in real time on an incoming data stream. An automated framework is presented that allows the features to adapt iteratively to a change in the underlying patterns of the data stream. This core feature of the presented work is achieved by separating the feature application step from the computationally costly feature construction step and by iteratively restarting the feature construction step on the new incoming data. The algorithm and the corresponding models are described in detail and applied to three case studies (customer churn prediction, bot detection in computer games, credit card fraud detection). The case studies show that the proposed algorithm is able to find distinctive information in data sequences and to use it effectively for classification tasks. The promising results indicate that the suggested approach can be applied to a wide range of other application areas that incorporate data sequences.
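The separation of the costly feature construction (training) step from the cheap feature application (classification) step could look roughly like the following sketch; the feature "blueprints" are represented as simple aggregation rules, and all names, aggregations, and data are illustrative assumptions rather than the thesis's actual algorithm:

# Hedged sketch of the two-step process described in the abstract: costly
# feature construction on training sequences, then cheap application of the
# discovered feature "blueprints" to an incoming stream in real time.
from dataclasses import dataclass
from typing import Callable, Dict, List

Transaction = Dict[str, float]   # one event of a user's sequence
Sequence = List[Transaction]     # all transactions of one user

@dataclass
class FeatureBlueprint:
    name: str
    aggregate: Callable[[Sequence], float]  # how to compute the feature

def construct_features(training_sequences: List[Sequence]) -> List[FeatureBlueprint]:
    """Costly offline step: in the full approach suitable features are discovered
    from training data; here two hard-coded examples stand in for discovered
    blueprints, including an aggregation over the elapsed time of the sequence."""
    return [
        FeatureBlueprint("total_amount", lambda s: sum(t["amount"] for t in s)),
        FeatureBlueprint("events_per_hour",
                         lambda s: len(s) / max(s[-1]["t"] - s[0]["t"], 1e-9) * 3600),
    ]

def apply_features(blueprints: List[FeatureBlueprint], sequence: Sequence) -> Dict[str, float]:
    """Cheap online step: evaluate the blueprints on a live sequence."""
    return {bp.name: bp.aggregate(sequence) for bp in blueprints}

# Example: evaluate a tiny invented sequence with a trivial threshold rule.
blueprints = construct_features(training_sequences=[])
stream_sequence = [{"t": 0.0, "amount": 10.0}, {"t": 1800.0, "amount": 250.0}]
features = apply_features(blueprints, stream_sequence)
is_suspicious = features["total_amount"] > 200 and features["events_per_hour"] > 2
print(features, "suspicious:", is_suspicious)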