
GAN-powered model- & landmark-free reconstruction: a versatile approach for high-quality 3D facial and object recovery from single images (2023)

Danner, Michael ; Huber, Patrik ; Awais, Muhammad ; Rätsch, Matthias ; Kittler, Josef

In recent years, 3D facial reconstructions from single images have garnered significant interest. Most of the approaches are based on 3D Morphable Model (3DMM) fitting to reconstruct the 3D face shape. Concurrently, the adoption of Generative Adversarial Networks (GANs) has been gaining momentum to improve the texture of reconstructed faces. In this paper, we propose a fundamentally different approach to reconstructing the 3D head shape from a single image by harnessing the power of GANs. Our method predicts three maps of normal vectors of the head’s frontal, left, and right poses. We thus present a model-free method that does not require any prior knowledge of the geometry of the object to be reconstructed. The key advantage of our proposed approach is the substantial improvement in reconstruction quality compared to existing methods, particularly in the case of facial regions that are self-occluded in the input image. Our method is not limited to 3D face reconstruction. It is generic and applicable to multiple kinds of 3D objects. To illustrate the versatility of our method, we demonstrate its efficacy in reconstructing the entire human body. By delivering a model-free method capable of generating high-quality 3D reconstructions, this paper not only advances the field of 3D facial reconstruction but also provides a foundation for future research and applications spanning multiple object types. The implications of this work have the potential to extend far beyond facial reconstruction, paving the way for innovative solutions and discoveries in various domains.

Towards equitable AI in HR: designing a fair, reliable, and transparent human resource management application (2023)

Danner, Michael ; Hadžić, Bakir ; Weber, Thomas ; Xinjuan, Zhu ; Rätsch, Matthias

The aim of this work is the development of an artificial intelligence (AI) application to support the recruiting process that elevates the domain of human resource management by advancing its capabilities and effectiveness. This affects recruiting processes and includes solutions for active sourcing, i.e. active recruitment, pre-sorting, evaluating structured video interviews and discovering internal training potential. This work highlights four novel approaches to ethical machine learning. The first is precise machine learning for ethically relevant properties in image recognition, which focuses on accurately detecting and analysing these properties. The second is the detection of bias in training data, allowing for the identification and removal of distortions that could skew results. The third is minimising bias, which involves actively working to reduce bias in machine learning models. Finally, an unsupervised architecture is introduced that can learn fair results even without ground truth data. Together, these approaches represent important steps forward in creating ethical and unbiased machine learning systems.

Investigation of tympanic membrane influences on middle-ear impedance measurements and simulations (2020)

Sackmann, Benjamin ; Warnholtz, Birthe ; Sim, Jae Hoon ; Burovikhin, Dmitrii ; Dalhoff, Ernst ; Eberhard, Peter ; Lauxmann, Michael

This study simulates acoustic impedance measurements in the human ear canal and investigates error influences due to improperly accounted evanescence in the probe’s near field, cross-section area changes, curvature of the ear canal, and pressure inhomogeneities across the tympanic membrane, which arise mainly at frequencies above 10 kHz. Evanescence results from strongly damped modes of higher order, which can only be found in the near field of the sound source and are excited due to sharp cross-sectional changes as they occur at the transition from the probe loudspeaker to the ear canal. This means that different impedances are measured depending on the probe design. The influence of evanescence cannot be eliminated completely from measurements; however, it can be reduced by a probe design with a larger distance between speaker and microphone. A completely different approach to account for the influence of evanescence is to evaluate impedance measurements with the help of a finite element model, which takes the precise arrangement of microphone and speaker in the measurement into account. The latter is demonstrated in this study, by way of example, on impedance measurements at a tube terminated with a steel plate. Furthermore, the influences of shape changes of the tympanic membrane and ear canal curvature on impedance are investigated.

Investigation of inhomogeneous stiffness and damping characteristics of the human stapedial annular ligament (2019)

Burovikhin, Dmitrii ; Sackmann, Benjamin ; Schär, Merlin ; Sim, Jae Hoon ; Eberhard, Peter ; Lauxmann, Michael

This study describes a non-contact measuring and system identification procedure for evaluating inhomogeneous stiffness and damping characteristics of the annular ligament in the physiological amplitude and frequency range without the application of large static external forces that can cause unnatural displacements of the stapes. To verify the procedure, measurements were first conducted on a steel beam. Then, measurements on an individual human cadaveric temporal bone sample were performed. The estimated results support the inhomogeneous stiffness and damping distribution of the annular ligament and are in good agreement with the multiphoton microscopy results, which show that the posterior-inferior corner of the stapes footplate is the stiffest region of the annular ligament.

Efficient and robust 3D object reconstruction based on monocular SLAM and CNN semantic segmentation (2019)

Weber, Thomas ; Triputen, Sergey ; Gopal, Atmaraaj ; Eißler, Steffen ; Höfert, Christian ; Schreve, Kristiaan ; Rätsch, Matthias

Various applications implement SLAM technology, especially in the field of robot navigation. We show the advantage of SLAM technology for independent 3D object reconstruction. To obtain a point cloud of every object of interest void of its environment, we leverage deep learning. We utilize recent CNN deep learning research for accurate semantic segmentation of objects. In this work, we propose two fusion methods for CNN-based semantic segmentation and SLAM for the 3D reconstruction of objects of interest in order to obtain greater robustness and efficiency. As a major novelty, we introduce CNN-based masking to focus SLAM only on feature points belonging to each individual object. Noisy, complex or even non-rigid features in the background are filtered out, improving the estimation of the camera pose and the 3D point cloud of each object. Our experiments are constrained to the reconstruction of industrial objects. We present an analysis of the accuracy and performance of each method and compare the two methods, describing their pros and cons.
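
The masking idea in this abstract, restricting SLAM to feature points that fall on a segmented object, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `filter_keypoints_by_mask`, the array layouts, and the use of integer labels in the mask are assumptions made purely for illustration, and a real pipeline would take its keypoints from the SLAM front end and its mask from the segmentation network.

```python
import numpy as np

def filter_keypoints_by_mask(keypoints, mask, object_id):
    """Keep only keypoints whose pixel carries the label `object_id`.

    keypoints: array-like of shape (N, 2) holding (x, y) pixel coordinates
               (hypothetical layout, e.g. from a feature detector).
    mask:      integer array of shape (H, W) with one label per pixel
               (hypothetical output of a semantic segmentation network).
    """
    keypoints = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    height, width = mask.shape

    # Round to the nearest pixel so the mask can be indexed.
    xs = np.rint(keypoints[:, 0]).astype(int)
    ys = np.rint(keypoints[:, 1]).astype(int)

    # Points outside the image can never belong to the object.
    inside = (xs >= 0) & (xs < width) & (ys >= 0) & (ys < height)

    keep = np.zeros(len(keypoints), dtype=bool)
    # Masks are indexed as [row, column], i.e. [y, x].
    keep[inside] = mask[ys[inside], xs[inside]] == object_id
    return keypoints[keep]
```

Only the surviving points would then be passed on to pose estimation and mapping, so background features no longer influence the result for that object.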

Multimodal neural networks: RGB-D for semantic segmentation and object detection (2017)

Schneider, Lukas ; Jasch, Manuel ; Fröhlich, Björn ; Weber, Thomas ; Franke, Uwe ; Pollefeys, Marc ; Rätsch, Matthias

This paper presents a novel multi-modal CNN architecture that exploits complementary input cues in addition to sole color information. The joint model implements a mid-level fusion that allows the network to exploit cross-modal interdependencies already at a medium feature level. The benefit of the presented architecture is shown for the RGB-D image understanding task. So far, state-of-the-art RGB-D CNNs have used network weights trained on color data. In contrast, a superior initialization scheme is proposed to pre-train the depth branch of the multi-modal CNN independently. In end-to-end training, the network parameters are optimized jointly using the challenging Cityscapes dataset. In thorough experiments, the effectiveness of the proposed model is shown. Both the RGB GoogLeNet and further RGB-D baselines are outperformed by a significant margin on two different tasks: semantic segmentation and object detection. For the latter, this paper shows how to extract object-level ground truth from the instance-level annotations in Cityscapes in order to train a powerful object detector.

A 3D face modelling approach for pose-invariant face recognition in a human-robot environment (2017)

Grupp, Michael ; Kopp, Philipp ; Huber, Patrik ; Rätsch, Matthias

Face analysis techniques have become a crucial component of human-machine interaction in the fields of assistive and humanoid robotics. However, the variations in head pose that arise naturally in these environments are still a great challenge. In this paper, we present a real-time capable 3D face modelling framework for 2D in-the-wild images that is applicable to robotics. The fitting of the 3D Morphable Model is based exclusively on automatically detected landmarks. After fitting, the face can be corrected in pose and transformed back to a frontal 2D representation that is more suitable for face recognition. We conduct face recognition experiments with non-frontal images from the MUCT database and uncontrolled, in-the-wild images from the PaSC database, the most challenging face recognition database to date, showing improved performance. Finally, we present our SCITOS G5 robot system, which incorporates our framework as a means of image pre-processing for face analysis.
