OPUS 4 | Search

Overcome ethnic discrimination with unbiased machine learning for facial data sets (2023)

Danner, Michael ; Hadžić, Bakir ; Radloff, Robert ; Su, Xueping ; Peng, Leping ; Weber, Thomas ; Rätsch, Matthias

AI-based prediction and recommender systems are widely used in various industry sectors. However, general acceptance of AI-enabled systems is still largely uninvestigated. Therefore, we first conducted a survey with 559 respondents. Findings suggested that AI-enabled systems should be fair, transparent, consider personality traits and perform tasks efficiently. Secondly, we developed a system for the Facial Beauty Prediction (FBP) benchmark that automatically evaluates facial attractiveness. As our previous experiments have shown, these results are usually highly correlated with human ratings. Consequently, they also reflect human bias in annotations. An upcoming challenge for scientists is to provide training data and AI algorithms that can withstand distorted information. In this work, we introduce AntiDiscriminationNet (ADN), a superior attractiveness prediction network. We propose a new method to generate an unbiased convolutional neural network (CNN) to improve the fairness of machine learning on facial datasets. To train unbiased networks, we generate synthetic images and weight training data for anti-discrimination assessments towards different ethnicities. Additionally, we introduce an approach with entropy penalty terms to reduce the bias of our CNN. Our research provides insights into how to train and build fair machine learning models for facial image analysis by minimising implicit biases. Our AntiDiscriminationNet finally outperforms all competitors in the FBP benchmark by achieving a Pearson correlation coefficient of PCC = 0.9601.
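The benchmark figure above is a Pearson correlation coefficient between predicted and human ratings. As a minimal sketch of how such a score is computed (the ratings below are invented for illustration and are not data from the paper):

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalised because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ratings on a 1-5 scale, for illustration only.
human = [1.0, 2.0, 3.0, 4.0, 5.0]
model = [1.2, 1.9, 3.2, 3.8, 5.1]
print(round(pearson(human, model), 4))
```

A high PCC only says that the predictions track the human ratings; as the abstract notes, it also means the predictions inherit whatever bias those ratings contain.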

GAN-powered model- & landmark-free reconstruction: a versatile approach for high-quality 3D facial and object recovery from single images (2023)

Danner, Michael ; Huber, Patrik ; Awais, Muhammad ; Rätsch, Matthias ; Kittler, Josef

In recent years, 3D facial reconstructions from single images have garnered significant interest. Most approaches are based on 3D Morphable Model (3DMM) fitting to reconstruct the 3D face shape. Concurrently, the adoption of Generative Adversarial Networks (GAN) has been gaining momentum to improve the texture of reconstructed faces. In this paper, we propose a fundamentally different approach to reconstructing the 3D head shape from a single image by harnessing the power of GAN. Our method predicts three maps of normal vectors of the head’s frontal, left, and right poses. We thus present a model-free method that does not require any prior knowledge of the geometry of the object to be reconstructed. The key advantage of our proposed approach is the substantial improvement in reconstruction quality compared to existing methods, particularly in the case of facial regions that are self-occluded in the input image. Our method is not limited to 3D face reconstruction. It is generic and applicable to multiple kinds of 3D objects. To illustrate the versatility of our method, we demonstrate its efficacy in reconstructing the entire human body. By delivering a model-free method capable of generating high-quality 3D reconstructions, this paper not only advances the field of 3D facial reconstruction but also provides a foundation for future research and applications spanning multiple object types. The implications of this work have the potential to extend far beyond facial reconstruction, paving the way for innovative solutions and discoveries in various domains.

Towards equitable AI in HR: designing a fair, reliable, and transparent human resource management application (2023)

Danner, Michael ; Hadžić, Bakir ; Weber, Thomas ; Xinjuan, Zhu ; Rätsch, Matthias

The aim of this work is the development of an artificial intelligence (AI) application to support the recruiting process that elevates the domain of human resource management by advancing its capabilities and effectiveness. This affects recruiting processes and includes solutions for active sourcing, i.e. active recruitment, pre-sorting, evaluating structured video interviews and discovering internal training potential. This work highlights four novel approaches to ethical machine learning. The first is precise machine learning for ethically relevant properties in image recognition, which focuses on accurately detecting and analysing these properties. The second is the detection of bias in training data, allowing for the identification and removal of distortions that could skew results. The third is minimising bias, which involves actively working to reduce bias in machine learning models. Finally, an unsupervised architecture is introduced that can learn fair results even without ground truth data. Together, these approaches represent important steps forward in creating ethical and unbiased machine learning systems.

Advancing mental health diagnostics: AI-based method for depression detection in patient interviews (2023)

Danner, Michael ; Hadžić, Bakir ; Gerhardt, Sophie ; Ludwig, Simon ; Uslu, Irem ; Shao, Peng ; Weber, Thomas ; Shiban, Youssef ; Rätsch, Matthias

In this paper, we present a novel artificial intelligence (AI) application for depression detection, using advanced transformer networks to analyse clinical interviews. By incorporating simulated data to enhance traditional datasets, we overcome limitations in data protection and privacy, consequently improving the model’s performance. Our methodology employs BERT-based models, GPT-3.5, and ChatGPT-4, demonstrating state-of-the-art results in detecting depression from linguistic patterns and contextual information that significantly outperform previous approaches. Utilising the DAIC-WOZ and Extended-DAIC datasets, our study showcases the potential of the proposed application in revolutionising mental health care through early depression detection and intervention. Empirical results from various experiments highlight the efficacy of our approach and its suitability for real-world implementation. Furthermore, we acknowledge the ethical, legal, and social implications of AI in mental health diagnostics. Ultimately, our study underscores the transformative potential of AI in mental health diagnostics, paving the way for innovative solutions that can facilitate early intervention and improve patient outcomes.

Pre‑training neural machine translation with alignment information via optimal transport (2023)

Su, Xueping ; Zhao, Xingkai ; Ren, Jie ; Li, Yunhong ; Rätsch, Matthias

With the rapid development of globalization, the demand for translation between different languages is also increasing. Although pre-training has achieved excellent results in neural machine translation, existing neural machine translation has almost no high-quality alignment information suitable for specific fields, so this paper proposes a pre-training neural machine translation method with alignment information via optimal transport. First, this paper narrows the representation gap between different languages by using OTAP to generate domain-specific data for information alignment, and learns richer semantic information. Second, this paper proposes a lightweight model, DR-Reformer, which uses Reformer as the backbone network, adds Dropout layers and Reduction layers, reduces model parameters without losing accuracy, and improves computational efficiency. Experiments on the Chinese and English datasets of AI Challenger 2018 and WMT-17 show that the proposed algorithm performs better than existing algorithms.

Approach for Digitising the Softness of Human Tissue for Implementation in 3D Soft Avatar Clothing Simulations (2023)

Brake, Elena Alida ; Danner, Michael ; Kosel, Gabriela ; Kyosev, Yordan ; Rätsch, Matthias ; Cebula, Holger ; Rose, Katerina

Patterns are virtually simulated in 3D CAD programs before production to check the fit. However, achieving lifelike representations of human avatars, especially regarding soft tissue dynamics, remains challenging. This is mainly because conventional avatars in garment CAD programs are simulated with a continuous hard surface and do not correspond to the physical and mechanical properties of human soft tissue. In the real world, the human body’s natural shape is affected by the contact pressure of tight-fitting textiles. To verify the fit of a simulated garment, the interactions between the individual body shape and the garment must be considered. This paper introduces an innovative approach to digitising the softness of human tissue using 4D scanning technology. The primary objective of this research is to explore the interactions between tissue softness and different compression levels of apparel, exerting pressure on the tissue to capture the changes in the natural shape. Therefore, to generate data and model an avatar with soft body physics, it is essential to capture the deformability and elasticity of the soft tissue and map it into the modification options for a simulation. To this end, various methods from different fields were researched and compared, and 4D scanning was identified as the most suitable method for capturing tissue deformability in vivo. In particular, it should be considered that the human body has different deformation capabilities depending on age and the amount of muscle and body fat. In addition, different tissue zones have different mechanical properties, so it is essential to identify and classify them to back up these properties for the simulation. It has been shown that by digitising the obtained data of the different defined applied pressure levels, a prediction of the deformation of the tissue of the exact person becomes possible. As technology advances and data sets grow, this approach has the potential to reshape how we verify fit digitally with soft avatars and leverage their realistic soft tissue properties for various practical purposes.

Simulating temporally and spatially correlated wind speed time series by spectral representation method (2023)

Xiao, Qing ; Wu, Lianghong ; Wu, Xiaowen ; Rätsch, Matthias

This paper aims to model wind speed time series at multiple sites. The five-parameter Johnson distribution is deployed to relate the wind speed at each site to a Gaussian time series, and the resultant m-dimensional Gaussian stochastic vector process Z(t) is employed to model the temporal-spatial correlation of wind speeds at m different sites. In general, it is computationally tedious to obtain the autocorrelation functions (ACFs) and cross-correlation functions (CCFs) of Z(t), which differ from those of the wind speed time series. To circumvent this correlation distortion problem, the rank ACF and rank CCF are introduced to characterize the temporal-spatial correlation of wind speeds, whereby the ACFs and CCFs of Z(t) can be obtained analytically. Then, the Fourier transform is applied to establish the cross-spectral density matrix of Z(t), and an analytical approach is proposed to generate samples of wind speeds at m different sites. Finally, simulation experiments are performed to check the proposed methods, and the results verify that the five-parameter Johnson distribution can accurately match the distribution functions of wind speeds, and that the spectral representation method can reproduce the temporal-spatial correlation of wind speeds well.
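The reason rank correlations avoid the distortion problem can be sketched in a few lines. The sketch assumes the rank ACF is the Spearman rank correlation between a series and its lagged copy, which the abstract does not spell out; the sample values are invented. Because ranks are unchanged by any strictly increasing transform, the rank correlation of the wind speeds equals that of the underlying Gaussian series, and for a bivariate Gaussian the Pearson correlation follows from the standard relation ρ = 2 sin(π ρ_s / 6).

```python
import math


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def rank_acf(series, lag):
    """Spearman rank correlation between the series and its copy shifted by lag >= 1."""
    return pearson(ranks(series[:-lag]), ranks(series[lag:]))


def gaussian_corr_from_rank(rho_s):
    """Pearson correlation of a bivariate Gaussian whose Spearman correlation is rho_s."""
    return 2.0 * math.sin(math.pi * rho_s / 6.0)


# Hypothetical wind speeds in m/s, for illustration only.
wind = [5.1, 6.3, 7.0, 6.1, 4.8, 5.5, 6.9, 7.4, 6.0, 5.2]
rho_s = rank_acf(wind, 1)
print(rho_s, gaussian_corr_from_rank(rho_s))
```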

TFCSG: An Unsupervised Approach for Question-retrieval Over Multi-task Learning (2023)

Aiguo, Shang ; Danner, Michael ; Xinjuan, Zhu ; Rätsch, Matthias

Most question-answering (QA) systems rely on training data to reach their optimal performance. However, acquiring training data for supervised systems is both time-consuming and resource-intensive. To address this, we propose TFCSG, an unsupervised similar-question retrieval approach that leverages pre-trained language models and multi-task learning. First, topic keywords in question sentences are extracted sequentially based on a latent topic-filtering algorithm to construct an unsupervised training corpus. Then, multi-task learning is used to build the question retrieval model. Three tasks are designed. The first is a short-sentence contrastive learning task. The second is a similarity judgment task between a question sentence and its corresponding topic sequence. The third is a generation task that uses question sentences to generate their corresponding topic sequences. The three tasks are used to train the language model in parallel. Finally, similar questions are obtained by calculating the cosine similarity between sentence vectors. Comparison experiments on public question datasets show that TFCSG outperforms the unsupervised baseline methods. Moreover, no manual annotation is needed, which greatly saves human resources.
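The final retrieval step, ranking candidate questions by cosine similarity of sentence vectors, can be sketched as follows. The three-dimensional vectors and the question texts are invented stand-ins; in the described system the vectors would come from the trained language model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query_vec, candidates, top_k=1):
    """Return the top_k candidate questions ordered by descending similarity."""
    scored = sorted(
        candidates.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [question for question, _ in scored[:top_k]]


# Hypothetical sentence vectors, for illustration only.
candidates = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "What is the refund policy?": [0.0, 0.2, 0.9],
    "How can I change my login password?": [0.8, 0.3, 0.1],
}
query = [1.0, 0.0, 0.2]  # stand-in vector for "I forgot my password"
print(most_similar(query, candidates, top_k=2))
```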

Evaluation of low-cost 3D scanner hardware for clothing industry (2023)

Danner, Michael ; Brake, Elena ; Decker, Christian ; Rätsch, Matthias ; Kyosev, Yordan ; Rose, Katerina

In recent years, the demand for accurate and efficient 3D body scanning technologies has increased, driven by the growing interest in personalised textile development and health care. This position paper presents the implementation of a novel 3D body scanner that integrates multiple RGB cameras and image stitching techniques to generate detailed point clouds and 3D mesh models. Our system significantly enhances the scanning process, achieving higher resolution and fidelity while reducing the cost, time and effort required for data acquisition and processing. Furthermore, we evaluate the potential use cases and applications of our 3D body scanner, focusing on the textile technology and health sectors. In textile development, the 3D scanner contributes to bespoke clothing production, allowing designers to construct made-to-measure garments, thus minimising waste and enhancing customer satisfaction through well-fitting clothing. In mental health care, the 3D body scanner can be employed as a tool for body image analysis, providing valuable insights into the psychological and emotional aspects of self-perception. By exploring the synergy between the 3D body scanner and these fields, we aim to foster interdisciplinary collaborations that drive advancements in personalisation, sustainability, and well-being.

Personalized clothing recommendation fusing the 4-season color system and users’ biological characteristics (2023)

Su, Xueping ; Duan, Jiawei ; Li, Yunhong ; Danner, Michael ; Rätsch, Matthias ; Peng, Jinye

In clothing e-commerce, optimally recommending clothing that suits a user’s unique characteristics remains a pressing issue. Many platforms simply recommend best-selling or popular clothing, without taking into account important attributes such as the user’s face color, pupil color, face shape and age. To solve this problem, this paper proposes a personalized clothing recommendation algorithm that incorporates the established 4-Season Color System and user-specific biological characteristics. First, the attributes and colors of clothing are classified by the Fnet network, which can learn disjoint label combinations and mitigate the issue of excessive labels. Second, on the basis of the 4-Season Color System, the user’s face color model is trained by the combined MobileNetV3_DTL, which ensures the model’s generalization and improves the training speed. Third, the user’s face shape and age are divided into different categories by an Inception network. Finally, according to the user’s face color, age, face shape and other information, personalized clothing is recommended in a coarse-to-fine manner. Experiments on five datasets demonstrate that the algorithm proposed in this paper achieves state-of-the-art results.

Evolutional normal maps: 3D face representations for 2D-3D face recognition, face modelling and data augmentation (2022)

Danner, Michael ; Weber, Thomas ; Huber, Patrik ; Awais, Muhammad ; Rätsch, Matthias ; Kittler, Josef

We address the problem of 3D face recognition based on either 3D sensor data, or on a 3D face reconstructed from a 2D face image. We focus on 3D shape representation in terms of a mesh of surface normal vectors. The first contribution of this work is an evaluation of eight different 3D face representations and their multiple combinations. An important contribution of the study is the proposed implementation, which allows these representations to be computed directly from 3D meshes, instead of point clouds. This enhances their computational efficiency. Motivated by the results of the comparative evaluation, we propose a 3D face shape descriptor, named Evolutional Normal Maps, that assimilates and optimises a subset of six of these approaches. The proposed shape descriptor can be modified and tuned to suit different tasks. It is used as input for a deep convolutional network for 3D face recognition. An extensive experimental evaluation using the Bosphorus 3D Face, CASIA 3D Face and JNU-3D Face datasets shows that, compared to the state of the art methods, the proposed approach is better in terms of both computational cost and recognition accuracy.

Semantic risk-aware costmaps for robots in industrial applications using deep learning on abstracted safety classes from synthetic data (2022)

Weber, Thomas ; Danner, Michael ; Zhang, Bo ; Rätsch, Matthias ; Zell, Andreas

For collision and obstacle avoidance as well as trajectory planning, robots usually generate and use a simple 2D costmap without any semantic information about the detected obstacles. Thus a robot’s path planning will simply adhere to an arbitrarily large safety margin around obstacles. A better approach is to adjust this safety margin according to the class of an obstacle. For class prediction, an image processing convolutional neural network can be trained. One of the problems in the development and training of any neural network is the creation of a training dataset. The first part of this work describes methods and free open source software that allow the fast generation of annotated datasets. Our pipeline can be applied to various objects and environment settings and makes it easy for anyone to synthesise training data from 3D source data. We create a fully synthetic industrial environment dataset with 10k physically-based rendered images and annotations. Our dataset and sources are publicly available at https://github.com/LJMP/synthetic-industrial-dataset. Subsequently, we train a convolutional neural network with our dataset for costmap safety class prediction. We analyse different class combinations and show that learning the safety classes end-to-end directly with a small dataset, instead of using a class lookup table, improves the quantity and precision of the predictions.
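The idea of a class-dependent safety margin can be illustrated with a toy grid costmap. The class names, the margin values and the square inflation shape are assumptions made for this sketch and are not taken from the paper.

```python
# Hypothetical safety classes and margins in grid cells; not taken from the paper.
SAFETY_MARGIN = {"static": 1, "vehicle": 2, "human": 3}


def build_costmap(width, height, obstacles):
    """Return a grid where 1 marks cells inside an obstacle's safety margin.

    obstacles: list of (x, y, safety_class) tuples in grid coordinates.
    Each obstacle is inflated by a square whose half-width depends on its class.
    """
    grid = [[0] * width for _ in range(height)]
    for ox, oy, safety_class in obstacles:
        margin = SAFETY_MARGIN[safety_class]
        for y in range(max(0, oy - margin), min(height, oy + margin + 1)):
            for x in range(max(0, ox - margin), min(width, ox + margin + 1)):
                grid[y][x] = 1
    return grid


def blocked_cells(grid):
    """Number of cells a planner would have to avoid."""
    return sum(sum(row) for row in grid)


# The same position blocks more cells when the obstacle is a person.
print(blocked_cells(build_costmap(9, 9, [(4, 4, "static")])))
print(blocked_cells(build_costmap(9, 9, [(4, 4, "human")])))
```

With a single fixed margin, every obstacle would have to be treated like the most safety-critical class, which wastes free space around harmless objects.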

Automatic identification of focus personage in multi-lingual news images (2021)

Su, Xueping ; Zhu, Danyao ; Ren, Jie ; Rätsch, Matthias

Annotations of character IDs in news images are critical as ground truth for news retrieval and recommendation systems. Optimizing the universality and accuracy of deep neural network models is the key to improving the precision and computing efficiency of automatic news character identification, which is attracting increased attention globally. This paper explores an optimized deep neural network model for automatic focus personage identification in multi-lingual news. First, the face model of the focus personage is trained by using the corresponding face images from German news as positive samples. Next, the scheme of Recurrent Convolutional Neural Network (RCNN) + Bi-directional Long-Short Term Memory (Bi-LSTM) + Conditional Random Field (CRF) is utilized to label the focus name, and the RCNN-RCNN encoder–decoder is applied to translate names of people into multiple languages. Third, face features are described by combining the advantages of Local Gabor Binary Pattern Histogram Sequence (LGBPHS) and RCNN, and iterative quantization (ITQ) is used to binarize codes. Finally, a name semantic network is built for different domains. Experiments are performed on a dataset which comprises approximately 100,000 news images. The experimental results demonstrate that the proposed method achieves a significant improvement over other algorithms.

Design and experiment of permanent magnet tubular linear generator using mechanical vertical vibration energy (2021)

Zhang, Bo ; Yang, Yongbao ; Feng, Zhi ; Suo, Yuchao ; Rätsch, Matthias

This paper presents a permanent magnet tubular linear generator system for powering passive sensors using energy harvested from vertical vibration. The system consists of a permanent magnet tubular linear vibration generator and electric circuits. By using mechanical resonant movers, the generator is capable of converting low-frequency, small-amplitude vertical vibration energy into more regular sinusoidal electrical energy. The distribution of the magnetic field and the electromotive force are calculated by Finite Element Analysis. The characteristics of the linear vibration generator system are observed. The experimental results show that the generator can produce about 0.4 W to 1.6 W of electrical power when the vibration source's amplitude is fixed at 2 mm and the frequencies are between 13 Hz and 22 Hz.

Deep learning-based EEG detection of mental alertness states from drivers under ethical aspects (2021)

Rohlinger, Tihomir ; Peng, Le Ping ; Gerlach, Tobias ; Pasler, Paul ; Zhang, Bo ; Seepold, Ralf ; Martínez Madrid, Natividad ; Rätsch, Matthias

One of the most critical factors for a successful road trip is a high degree of alertness while driving. Even a split second of inattention or sleepiness in a crucial moment can make the difference between life and death. Several prestigious car manufacturers are currently pursuing automated drowsiness identification to resolve this problem. The path between neuro-scientific research in connection with artificial intelligence and the preservation of the dignity of human individuals and its inviolability is very narrow. The key contribution of this work is a system of data analysis for EEGs during a driving session, which draws on previous studies analyzing heart rate (ECG), brain waves (EEG), and eye function (EOG). The gathered data is treated as sensitively as possible, taking ethical regulations into consideration. Obtaining evaluable signs of evolving exhaustion involves techniques that extract sleep-stage frequencies; the correlated interferences in the signal are problematic here. This research focuses on a processing chain for EEG band splitting that involves band-pass filtering, principal component analysis (PCA), independent component analysis (ICA) with automatic artefact severance, and fast Fourier transform (FFT). The classification is based on a step-by-step adaptive deep learning analysis that detects theta rhythms as a drowsiness predictor in the pre-processed data. It was possible to obtain an offline detection rate of 89% and an online detection rate of 73%. The method is linked to the simulated driving scenario for which it was developed. This leaves room for further optimization of laboratory methods and data collection during wakefulness-dependent operations.
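Only the last stage of the processing chain, measuring how much spectral power falls into the theta band, is sketched here. It uses a naive discrete Fourier transform instead of an FFT library, takes the conventional 4 to 8 Hz range for theta, and runs on synthetic sine waves; the band limits, the sampling rate and the 1 to 30 Hz reference range are assumptions, and the filtering, PCA and ICA steps are omitted.

```python
import cmath
import math


def band_power(signal, sample_rate, low_hz, high_hz):
    """Sum of squared DFT magnitudes for frequency bins inside [low_hz, high_hz]."""
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * sample_rate / n
        if low_hz <= freq <= high_hz:
            coeff = sum(
                signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
            )
            power += abs(coeff) ** 2
    return power


def theta_ratio(signal, sample_rate):
    """Share of the 1-30 Hz power that lies in the theta band (4-8 Hz)."""
    total = band_power(signal, sample_rate, 1.0, 30.0)
    if total == 0:
        return 0.0
    return band_power(signal, sample_rate, 4.0, 8.0) / total


# Synthetic two-second signals sampled at 64 Hz, for illustration only.
RATE = 64
theta_wave = [math.sin(2 * math.pi * 6 * t / RATE) for t in range(2 * RATE)]
beta_wave = [math.sin(2 * math.pi * 20 * t / RATE) for t in range(2 * RATE)]
print(theta_ratio(theta_wave, RATE) > theta_ratio(beta_wave, RATE))
```

A ratio like this could serve as one input feature for a classifier; the abstract describes a deep learning analysis on the pre-processed data rather than a fixed threshold.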

Deep adversarial domain adaptation model for bearing fault diagnosis (2021)

Liu, Zhao-Hua ; Lu, Bi-Liang ; Wei, Hua-Liang ; Rätsch, Matthias

Fault diagnosis of rolling bearings is an essential process for improving the reliability and safety of the rotating machinery. It is always a major challenge to ensure fault diagnosis accuracy in particular under severe working conditions. In this article, a deep adversarial domain adaptation (DADA) model is proposed for rolling bearing fault diagnosis. This model constructs an adversarial adaptation network to solve the commonly encountered problem in numerous real applications: the source domain and the target domain are inconsistent in their distribution. First, a deep stack autoencoder (DSAE) is combined with representative feature learning for dimensionality reduction, and such a combination provides an unsupervised learning method to effectively acquire fault features. Meanwhile, domain adaptation and recognition classification are implemented using a Softmax classifier to augment classification accuracy. Second, the effects of the number of hidden layers in the stack autoencoder network, the number of neurons in each hidden layer, and the hyperparameters of the proposed fault diagnosis algorithm are analyzed. Third, comprehensive analysis is performed on real data to validate the performance of the proposed method; the experimental results demonstrate that the new method outperforms the existing machine learning and deep learning methods, in terms of classification accuracy and generalization ability.

Ethically aligned deep learning: unbiased facial aesthetic prediction (2021)

Danner, Michael ; Weber, Thomas ; Peng, Leping ; Gerlach, Tobias ; Su, Xueping ; Rätsch, Matthias

Facial beauty prediction (FBP) aims to develop a machine that automatically assesses facial attractiveness. In the past, those results were highly correlated with human ratings, and therefore also with the annotators' bias. As artificial intelligence can have racist and discriminatory tendencies, the cause of skews in the data must be identified. Development of training data and AI algorithms that are robust against biased information is a new challenge for scientists. As aesthetic judgement usually is biased, we want to take it one step further and propose an Unbiased Convolutional Neural Network for FBP. While it is possible to create network models that can rate attractiveness of faces on a high level, from an ethical point of view, it is equally important to make sure the model is unbiased. In this work, we introduce AestheticNet, a state-of-the-art attractiveness prediction network, which significantly outperforms competitors with a Pearson Correlation of 0.9601. Additionally, we propose a new approach for generating a bias-free CNN to improve fairness in machine learning.

Who loves virtue as much as he loves beauty?: Deep learning based estimator for aesthetics of portraits (2020)

Gerlach, Tobias ; Danner, Michael ; Peng, Le ; Kaminickas, Aidas ; Fei, Wu ; Rätsch, Matthias

“I have never seen one who loves virtue as much as he loves beauty,” Confucius once said. If beauty is more important than goodness, it becomes clear why people invest so much effort in their first impression. The aesthetics of faces has many aspects, and there is a strong correlation with all human characteristics, such as age and gender. Often, research on aesthetics by social and ethics scientists lacks sufficient labelled data and the support of machine vision tools. In this position paper we propose the Aesthetic-Faces dataset, containing training data which is labelled by Chinese and German annotators. As a combination of three image subsets, the AF-dataset consists of European, Asian and African people. The research communities in machine learning, aesthetics and social ethics can benefit from our dataset and our toolbox. The toolbox provides many functions for machine learning with state-of-the-art CNNs and an Extreme-Gradient-Boosting regressor, but also 3D Morphable Model technologies for face shape evaluation, and we discuss how to train an aesthetic estimator considering culture and ethics.

Mobile-Unet: An efficient convolutional neural network for fabric defect detection (2020)

Jing, Junfeng ; Wang, Zhen ; Rätsch, Matthias ; Zhang, Huanhuan

Deep learning-based fabric defect detection methods have been widely investigated to improve production efficiency and product quality. Although deep learning-based methods have proved to be powerful tools for classification and segmentation, some key issues remain to be addressed when applied to real applications. Firstly, the actual fabric production conditions of factories demand higher real-time performance. Moreover, fabric defects as abnormal samples are very rare compared with normal samples, which results in data imbalance and makes deep learning-based model training challenging. To solve these problems, an extremely efficient convolutional neural network, Mobile-Unet, is proposed to achieve end-to-end defect segmentation. The median frequency balancing loss function is used to overcome the challenge of sample imbalance. Additionally, Mobile-Unet introduces depth-wise separable convolution, which dramatically reduces the complexity cost and model size of the network. It comprises two parts: encoder and decoder. The MobileNetV2 feature extractor is used as the encoder, and then five deconvolution layers are added as the decoder. Finally, the softmax layer is used to generate the segmentation mask. The performance of the proposed model has been evaluated on public fabric datasets and self-built fabric datasets. In comparison with other methods, the experimental results demonstrate that the segmentation accuracy and detection speed of the proposed method achieve state-of-the-art performance.
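Median frequency balancing assigns each class a loss weight of the median class frequency divided by that class's own frequency, so rare defect pixels count more than abundant background pixels. A minimal sketch using the common definition (frequency = pixels of the class divided by the total pixels of the images in which the class occurs); the tiny label maps are invented, and the exact variant used in the paper may differ:

```python
from statistics import median


def median_frequency_weights(label_maps, num_classes):
    """Class weights w_c = median(freq) / freq_c.

    freq_c = pixels of class c / total pixels of the images in which c occurs.
    label_maps: list of 2D lists of integer class ids.
    Classes that never occur receive no weight.
    """
    class_pixels = [0] * num_classes
    image_pixels = [0] * num_classes
    for label_map in label_maps:
        flat = [c for row in label_map for c in row]
        for c in set(flat):
            class_pixels[c] += flat.count(c)
            image_pixels[c] += len(flat)
    freqs = {
        c: class_pixels[c] / image_pixels[c]
        for c in range(num_classes)
        if image_pixels[c] > 0
    }
    med = median(freqs.values())
    return {c: med / f for c, f in freqs.items()}


# Hypothetical 2x2 label maps: class 0 = normal fabric, class 1 = defect.
maps = [
    [[0, 0], [0, 1]],
    [[0, 0], [0, 0]],
]
weights = median_frequency_weights(maps, num_classes=2)
print(weights)
```

The resulting weights would typically be passed to a weighted cross-entropy loss, so that misclassifying a defect pixel costs more than misclassifying a background pixel.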

Personalized clothing recommendation based on user emotional analysis (2020)

Su, Xueping ; Gao, Meng ; Ren, Jie ; Li, Yunhong ; Rätsch, Matthias

With the continuous development of the economy, consumers pay more attention to personalized clothing. However, the recommendation quality of existing clothing recommendation systems is not sufficient to meet users’ needs. When a user browses clothing online, facial expression is salient information for understanding the user’s preference. In this paper, we propose a novel method to automatically personalize clothing recommendation based on user emotional analysis. First, the facial expression is classified by a multiclass SVM. Next, the user’s multi-interest value is calculated using the expression intensity obtained by a hybrid RCNN. Finally, the multi-interest value is fused to carry out personalized recommendation. The experimental results show that the proposed method achieves a significant improvement over other algorithms.
