004 Informatik
Product engineering and subsequent phases of product lifecycles are predominantly managed in isolation. Companies therefore do not fully exploit the potential of data from smart factories and product usage. The novel intelligent and integrated Product Lifecycle Management (i²PLM) describes an approach that uses these data for product engineering. This paper describes the i²PLM, shows the cause-and-effect relationships in this context, and presents the validation of the approach in detail. The i²PLM is applied and validated on a smart product in an industrial research environment. Here, the subsequent generation of a smart lunchbox is developed based on production and sensor data. The results of the validation give indications for further improvements of the i²PLM. This paper also describes how to integrate the i²PLM into a learning factory.
Applications often need to be deployed in different variants due to different customer requirements. However, since modern applications often need to be deployed using multiple deployment technologies in combination, such as Ansible and Terraform, the deployment variability must be considered in a holistic way. To tackle this, we previously developed Variability4TOSCA and the prototype OpenTOSCA Vintner, which is a TOSCA preprocessing and management layer that implements Variability4TOSCA. In this demonstration, we present a detailed case study that shows how to model a deployment using Variability4TOSCA, how to resolve the variability using Vintner, and how the result can be deployed.
The aim of this work is the development of an artificial intelligence (AI) application to support the recruiting process that elevates the domain of human resource management by advancing its capabilities and effectiveness. This affects recruiting processes and includes solutions for active sourcing (i.e., active recruitment), pre-sorting, evaluating structured video interviews, and discovering internal training potential. This work highlights four novel approaches to ethical machine learning. The first is precise machine learning for ethically relevant properties in image recognition, which focuses on accurately detecting and analysing these properties. The second is the detection of bias in training data, allowing for the identification and removal of distortions that could skew results. The third is minimising bias, which involves actively working to reduce bias in machine learning models. Finally, an unsupervised architecture is introduced that can learn fair results even without ground truth data. Together, these approaches represent important steps forward in creating ethical and unbiased machine learning systems.
Smart cities are considered data factories that generate an enormous amount of data from various sources. In fact, data is the backbone of any smart service. Therefore, the strategically beneficial handling of this digital capital is crucial for cities. Some smart city pioneers have already written down their approach to data in the form of data strategies, but what should a city's data strategy include, and how can the goals and measures defined in the strategies be operationalized? This paper addresses these questions by looking closely at the data strategies of cities in Germany and the top three countries in the EU Digital Economy and Society Index. The in-depth analysis of 8 city data strategies has yielded 11 dimensions that cities should consider in their data strategy. These are relevance of data, principles, methods, data sharing, technology, data culture, data ethics, organizational structure, data security and privacy, collaborations, and data literacy. In addition, data governance is a concept to put these 11 strategic dimensions into practice through standardization measures, training programs, and defining roles and responsibilities by developing a data catalog.
For large-scale processes as implemented in organizations that develop software in regulated domains, comprehensive software process models are implemented, e.g., for compliance requirements. Creating and evolving such processes is demanding and requires software engineers with substantial modeling skills to create consistent and certifiable processes. While teaching process engineering to students, we observed issues in providing and explaining models. In this paper, we present an exploratory study in which we aim to shed light on the challenges students face when it comes to modeling. Our findings show that students are capable of doing basic modeling tasks, yet fail to utilize models correctly. We conclude that the required skills, notably abstraction and solution development, are underdeveloped due to missing practice and routine. Since modeling is key to many software engineering disciplines, we advocate for intensifying modeling activities in teaching.
Most question-answering (QA) systems rely on training data to reach their optimal performance. However, acquiring training data for supervised systems is both time-consuming and resource-intensive. To address this, we propose TFCSG, an unsupervised similar question retrieval approach that leverages pre-trained language models and multi-task learning. First, topic keywords in question sentences are extracted sequentially based on a latent topic-filtering algorithm to construct an unsupervised training corpus. Then, a multi-task learning method is used to build the question retrieval model. Three tasks are designed. The first is a short sentence contrastive learning task. The second is a similarity judgment task between a question sentence and its corresponding topic sequence. The third is a generation task in which question sentences are used to generate their corresponding topic sequences. The three tasks are used to train the language model in parallel. Finally, similar questions are obtained by calculating the cosine similarity between sentence vectors. The comparison experiment on public question datasets shows that TFCSG outperforms the comparative unsupervised baseline method. Moreover, no manual labeling is needed, which greatly saves human resources.
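The final retrieval step named in the abstract above, ranking candidates by cosine similarity between sentence vectors, can be illustrated with a minimal sketch. It assumes the sentence vectors have already been produced by some encoder. The function names `cosine_similarity` and `rank_similar` and the toy vectors are hypothetical and are not taken from the TFCSG paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def rank_similar(query_vec, candidates):
    """Sort candidate questions by descending similarity to the query.

    candidates: dict mapping question text to its sentence vector
    (hypothetical data layout, chosen for illustration only).
    """
    scored = [
        (text, cosine_similarity(query_vec, vec))
        for text, vec in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy vectors standing in for encoder output.
    candidates = {
        "How do I reset my password?": [0.9, 0.1, 0.0],
        "What is the refund policy?": [0.0, 0.2, 0.9],
    }
    for text, score in rank_similar([1.0, 0.0, 0.0], candidates):
        print(f"{score:.3f}  {text}")
```

In practice the vectors would come from the trained language model, and a vectorized library routine would replace the pure-Python loops for large candidate sets.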
The fifth generation of mobile communication (5G) is a wireless technology developed to provide reliable, fast data transmission for industrial applications, such as autonomous mobile robots, and to connect cyber-physical systems using Internet of Things (IoT) sensors. In this context, private 5G networks enable the full performance of industrial applications built on dedicated 5G infrastructures. However, emerging wireless communication technologies such as 5G are a complex and challenging topic for training in learning factories, often lacking physical or visual interaction. Therefore, this paper aims to develop a real-time performance monitoring system for private 5G networks and different industrial 5G devices to visualise the performance of 5G and the factors influencing it for students and future connectivity experts. Additionally, this paper presents the first long-term measurements of private 5G networks and shows the gap between the actual and targeted performance of private 5G networks.
OpenAPI, WADL, RAML, and API Blueprint are popular formats for documenting Web APIs. Although these formats are in general both human- and machine-readable, only the part of the format describing the syntax of a Web API is machine-understandable. Descriptions, which explain the meaning and purpose of Web API elements, are embedded as natural language text snippets into documents and target human readers but not machines. To enable machines to read and process this state-of-practice Web API documentation, we propose a Transformer model that solves the generic task of identifying a Web API element within a syntax structure that matches a natural language query. For our first prototype, we focus on the Web API integration task of matching output with input parameters and fine-tuned a pre-trained CodeBERT model to the downstream task of question answering with samples from 2,321 OpenAPI documents. We formulate the original question answering problem as a multiple choice task: given a semantic natural language description of an output parameter (question) and the syntax of the input schema (paragraph), the model chooses the input parameter (answer) in the schema that best matches the description. The paper describes the data preparation, tokenization, and fine-tuning process and discusses possible applications of our model as part of a recommender system. Furthermore, we evaluate the generalizability and the robustness of our fine-tuned model, with the result that it achieves an accuracy of 81.46% correctly chosen parameters.
The performance and scalability of modern data-intensive systems are limited by massive data movement of growing datasets across the whole memory hierarchy to the CPUs. Such traditional processor-centric DBMS architectures are bandwidth- and latency-bound. Processing-in-Memory (PIM) designs seek to overcome these limitations by integrating memory and processing functionality on the same chip. PIM targets near- or in-memory data processing, leveraging the greater in-situ parallelism and bandwidth.
In this paper, we introduce pimDB and provide an initial comparison of processor-centric and PIM-DBMS approaches under different aspects, such as scalability and parallelism, cache-awareness, or PIM-specific compute/bandwidth tradeoffs. The evaluation is performed end-to-end on a real PIM hardware system from UPMEM.
AI-based prediction and recommender systems are widely used in various industry sectors. However, general acceptance of AI-enabled systems is still largely uninvestigated. Therefore, we first conducted a survey with 559 respondents. Findings suggested that AI-enabled systems should be fair, transparent, consider personality traits, and perform tasks efficiently. Secondly, we developed a system for the Facial Beauty Prediction (FBP) benchmark that automatically evaluates facial attractiveness. As our previous experiments have proven, these results are usually highly correlated with human ratings. Consequently, they also reflect human bias in annotations. An upcoming challenge for scientists is to provide training data and AI algorithms that can withstand distorted information. In this work, we introduce AntiDiscriminationNet (ADN), a superior attractiveness prediction network. We propose a new method to generate an unbiased convolutional neural network (CNN) to improve the fairness of machine learning on facial datasets. To train unbiased networks, we generate synthetic images and weight training data for anti-discrimination assessments towards different ethnicities. Additionally, we introduce an approach with entropy penalty terms to reduce the bias of our CNN. Our research provides insights into how to train and build fair machine learning models for facial image analysis by minimising implicit biases. Our AntiDiscriminationNet finally outperforms all competitors in the FBP benchmark by achieving a Pearson correlation coefficient of PCC = 0.9601.
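The Pearson correlation coefficient (PCC) reported above measures how closely predicted scores track human ratings on a linear scale from -1 to 1. The following is a minimal sketch of the standard sample formula. The function name `pearson_correlation` and the example numbers are hypothetical and are not data from the paper.

```python
import math


def pearson_correlation(predicted, rated):
    """Sample Pearson correlation between two equal-length sequences."""
    if len(predicted) != len(rated) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(rated) / n
    # Sums of centered products and centered squares.
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, rated))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in rated)
    if var_p == 0.0 or var_r == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_r)


if __name__ == "__main__":
    # Made-up model scores and human ratings, for illustration only.
    model_scores = [2.1, 3.4, 3.9, 4.6]
    human_ratings = [2.0, 3.5, 3.7, 4.8]
    print(f"PCC = {pearson_correlation(model_scores, human_ratings):.4f}")
```

A value near 1 indicates that predictions rise and fall with the human ratings, which is also why, as the abstract notes, a high PCC can carry over bias present in the annotations.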