OPUS 4 | Search

Advancing mental health diagnostics: AI-based method for depression detection in patient interviews (2023)

Danner, Michael ; Hadžić, Bakir ; Gerhardt, Sophie ; Ludwig, Simon ; Uslu, Irem ; Shao, Peng ; Weber, Thomas ; Shiban, Youssef ; Rätsch, Matthias

In this paper, we present a novel artificial intelligence (AI) application for depression detection, using advanced transformer networks to analyse clinical interviews. By incorporating simulated data to enhance traditional datasets, we overcome limitations in data protection and privacy, consequently improving the model’s performance. Our methodology employs BERT-based models, GPT-3.5, and ChatGPT-4, demonstrating state-of-the-art results in detecting depression from linguistic patterns and contextual information that significantly outperform previous approaches. Utilising the DAIC-WOZ and Extended-DAIC datasets, our study showcases the potential of the proposed application in revolutionising mental health care through early depression detection and intervention. Empirical results from various experiments highlight the efficacy of our approach and its suitability for real-world implementation. Furthermore, we acknowledge the ethical, legal, and social implications of AI in mental health diagnostics. Ultimately, our study underscores the transformative potential of AI in mental health diagnostics, paving the way for innovative solutions that can facilitate early intervention and improve patient outcomes.

Overcome ethnic discrimination with unbiased machine learning for facial data sets (2023)

Danner, Michael ; Hadžić, Bakir ; Radloff, Robert ; Su, Xueping ; Peng, Leping ; Weber, Thomas ; Rätsch, Matthias

AI-based prediction and recommender systems are widely used in various industry sectors. However, general acceptance of AI-enabled systems is still widely uninvestigated. Therefore, firstly we conducted a survey with 559 respondents. Findings suggested that AI-enabled systems should be fair, transparent, consider personality traits and perform tasks efficiently. Secondly, we developed a system for the Facial Beauty Prediction (FBP) benchmark that automatically evaluates facial attractiveness. As our previous experiments have proven, these results are usually highly correlated with human ratings. Consequently they also reflect human bias in annotations. An upcoming challenge for scientists is to provide training data and AI algorithms that can withstand distorted information. In this work, we introduce AntiDiscriminationNet (ADN), a superior attractiveness prediction network. We propose a new method to generate an unbiased convolutional neural network (CNN) to improve the fairn ess of machine learning in facial dataset. To train unbiased networks we generate synthetic images and weight training data for anti-discrimination assessments towards different ethnicities. Additionally, we introduce an approach with entropy penalty terms to reduce the bias of our CNN. Our research provides insights in how to train and build fair machine learning models for facial image analysis by minimising implicit biases. Our AntiDiscriminationNet finally outperforms all competitors in the FBP benchmark by achieving a Pearson correlation coefficient of PCC = 0.9601.

Towards equitable AI in HR: designing a fair, reliable, and transparent human resource management application (2023)

Danner, Michael ; Hadžić, Bakir ; Weber, Thomas ; Xinjuan, Zhu ; Rätsch, Matthias

The aim of this work is the development of artificial intelligence (AI) application to support the recruiting process that elevates the domain of human resource management by advancing its capabilities and effectiveness. This affects recruiting processes and includes solutions for active sourcing, i.e. active recruitment, pre-sorting, evaluating structured video interviews and discovering internal training potential. This work highlights four novel approaches to ethical machine learning. The first is precise machine learning for ethically relevant properties in image recognition, which focuses on accurately detecting and analysing these properties. The second is the detection of bias in training data, allowing for the identification and removal of distortions that could skew results. The third is minimising bias, which involves actively working to reduce bias in machine learning models. Finally, an unsupervised architecture is introduced that can learn fair results even without ground truth data. Together, these approaches represent important steps forward in creating ethical and unbiased machine learning systems.

Evolutional normal maps: 3D face representations for 2D-3D face recognition, face modelling and data augmentation (2022)

Danner, Michael ; Weber, Thomas ; Huber, Patrik ; Awais, Muhammad ; Rätsch, Matthias ; Kittler, Josef

We address the problem of 3D face recognition based on either 3D sensor data, or on a 3D face reconstructed from a 2D face image. We focus on 3D shape representation in terms of a mesh of surface normal vectors. The first contribution of this work is an evaluation of eight different 3D face representations and their multiple combinations. An important contribution of the study is the proposed implementation, which allows these representations to be computed directly from 3D meshes, instead of point clouds. This enhances their computational efficiency. Motivated by the results of the comparative evaluation, we propose a 3D face shape descriptor, named Evolutional Normal Maps, that assimilates and optimises a subset of six of these approaches. The proposed shape descriptor can be modified and tuned to suit different tasks. It is used as input for a deep convolutional network for 3D face recognition. An extensive experimental evaluation using the Bosphorus 3D Face, CASIA 3D Face and JNU-3D Face datasets shows that, compared to the state of the art methods, the proposed approach is better in terms of both computational cost and recognition accuracy.

Ethically aligned deep learning: unbiased facial aesthetic prediction (2021)

Danner, Michael ; Weber, Thomas ; Peng, Leping ; Gerlach, Tobias ; Su, Xueping ; Rätsch, Matthias

Facial beauty prediction (FBP) aims to develop a machine that automatically makes facial attractiveness assessment. In the past those results were highly correlated with human ratings, therefore also with their bias in annotating. As artificial intelligence can have racist and discriminatory tendencies, the cause of skews in the data must be identified. Development of training data and AI algorithms that are robust against biased information is a new challenge for scientists. As aesthetic judgement usually is biased, we want to take it one step further and propose an Unbiased Convolutional Neural Network for FBP. While it is possible to create network models that can rate attractiveness of faces on a high level, from an ethical point of view, it is equally important to make sure the model is unbiased. In this work, we introduce AestheticNet, a state-of-the-art attractiveness prediction network, which significantly outperforms competitors with a Pearson Correlation of 0.9601. Additionally, we propose a new approach for generating a bias-free CNN to improve fairness in machine learning.

Evaluating body movement and breathing signals for identification of sleep/wake states (2022)

Gaiduk, Maksym ; Seepold, Ralf ; Martínez Madrid, Natividad ; Penzel, Thomas ; Weber, Lucas ; Conti, Massimo ; Orcioni, Simone ; Ortega, Juan Antonio

Recognition of sleep and wake states is one of the relevant parts of sleep analysis. Performing this measurement in a contactless way increases comfort for the users. We present an approach evaluating only movement and respiratory signals to achieve recognition, which can be measured non-obtrusively. The algorithm is based on multinomial logistic regression and analyses features extracted out of mentioned above signals. These features were identified and developed after performing fundamental research on characteristics of vital signals during sleep. The achieved accuracy of 87% with the Cohen’s kappa of 0.40 demonstrates the appropriateness of a chosen method and encourages continuing research on this topic.

Fast and robust RGB-D scene labeling for autonomous driving (2018)

Jasch, Manuel ; Weber, Thomas ; Rätsch, Matthias

For autonomously driving cars and intelligent vehicles it is crucial to understand the scene context including objects in the surrounding. A fundamental technique accomplishing this is scene labeling. That is, assigning a semantic class to each pixel in a scene image. This task is commonly tackled quite well by fully convolutional neural networks (FCN). Crucial factors are a small model size and a low execution time. This work presents the first method that exploits depth cues together with confidence estimates in a CNN. To this end, novel experimentally grounded network architecture is proposed to perform robust scene labeling that does not require costly preprocessing like CRFs or LSTMs as commonly used in related work. The effectiveness of this approach is demonstrated in an extensive evaluation on a challenging real-world dataset. The new architecture is highly optimized for high accuracy and low execution time.

Multimodal neural networks: RGB-D for semantic segmentation and object detection (2017)

Schneider, Lukas ; Jasch, Manuel ; Fröhlich, Björn ; Weber, Thomas ; Franke, Uwe ; Pollefeys, Marc ; Rätsch, Matthias

This paper presents a novel multi-modal CNN architecture that exploits complementary input cues in addition to sole color information. The joint model implements a mid-level fusion that allows the network to exploit cross modal interdependencies already on a medium feature-level. The benefit of the presented architecture is shown for the RGB-D image understanding task. So far, state-of-the-art RGB-D CNNs have used network weights trained on color data. In contrast, a superior initialization scheme is proposed to pre-train the depth branch of the multi-modal CNN independently. In an end-to-end training the network parameters are optimized jointly using the challenging Cityscapes dataset. In thorough experiments, the effectiveness of the proposed model is shown. Both, the RGB GoogLeNet and further RGB-D baselines are outperformed with a significant margin on two different tasks: semantic segmentation and object detection. For the latter, this paper shows how to extract object level groundtruth from the instance level annotations in Cityscapes in order to train a powerful object detector.

Methodology to analyze the accuracy of 3D objects reconstructed with collaborative robot based monocular LSD-SLAM (2018)

Triputen, Sergey ; Gopal, Atmaraaj ; Weber, Thomas ; Höfert, Christian ; Rätsch, Matthias ; Schreve, Kristiaan

SLAM systems are mainly applied for robot navigation while research on feasibility for motion planning with SLAM for tasks like bin-picking, is scarce. Accurate 3D reconstruction of objects and environments is important for planning motion and computing optimal gripper pose to grasp objects. In this work, we propose the methods to analyze the accuracy of a 3D environment reconstructed using a LSD-SLAM system with a monocular camera mounted onto the gripper of a collaborative robot. We discuss and propose a solution to the pose space conversion problem. Finally, we present several criteria to analyze the 3D reconstruction accuracy. These could be used as guidelines to improve the accuracy of 3D reconstructions with monocular LSD-SLAM and other SLAM based solutions.

Enhancing current cardiorespiratory-based approaches of sleep stage classification by temporal feature stacking (2021)

Weber, Lucas ; Gaiduk, Maksym ; Seepold, Ralf ; Martínez Madrid, Natividad ; Glos, Martin ; Penzel, Thomas

This paper presents a generic method to enhance performance and incorporate temporal information for cardiorespiratory-based sleep stage classification with a limited feature set and limited data. The classification algorithm relies on random forests and a feature set extracted from long-time home monitoring for sleep analysis. Employing temporal feature stacking, the system could be significantly improved in terms of Cohen’s κ and accuracy. The detection performance could be improved for three classes of sleep stages (Wake, REM, Non-REM sleep), four classes (Wake, Non-REM-Light sleep, Non-REM Deep sleep, REM sleep), and five classes (Wake, N1, N2, N3/4, REM sleep) from a κ of 0.44 to 0.58, 0.33 to 0.51, and 0.28 to 0.44 respectively by stacking features before and after the epoch to be classified. Further analysis was done for the optimal length and combination method for this stacking approach. Overall, three methods and a variable duration between 30 s and 30 min have been analyzed. Overnight recordings of 36 healthy subjects from the Interdisciplinary Center for Sleep Medicine at Charité-Universitätsmedizin Berlin and Leave-One-Out-Cross-Validation on a patient-level have been used to validate the method.

Author(s)
Title
Additional person(s)
Publisher
Supervisor(s)
Abstract
Full text

Open Access

Refine

Author

Year of publication

Document Type

Language

Has full text

Is part of the Bibliography

Institute

Publisher

13 search hits