Refine
Document Type
- Conference proceeding (19)
- Journal article (9)
Has full text
- yes (28)
Is part of the Bibliography
- yes (28)
Institute
- Informatik (24)
- Technik (3)
- Life Sciences (1)
Publisher
- IEEE (12)
- De Gruyter (3)
- Association for Computing Machinery (2)
- Springer (2)
- ARVO (1)
- Deutsche Gesellschaft für Computer- und Roboterassistierte Chirurgie e.V. (1)
- MDPI (1)
- PLOS (1)
- Sage Publishing (1)
- Taylor & Francis (1)
Reliable and accurate car driver head pose estimation is an important function for the next generation of advanced driver assistance systems that need to consider the driver state in their analysis. For optimal performance, head pose estimation needs to be non-invasive, calibration-free and accurate for varying driving and illumination conditions. In this pilot study we investigate a 3D head pose estimation system that automatically fits a statistical 3D face model to measurements of a driver’s face, acquired with a low-cost depth sensor on challenging real-world data. We evaluate the results of our sensor-independent, driver-adaptive approach to those of a state-of-the-art camera-based 2D face tracking system as well as a non-adaptive 3D model relative to own ground-truth data, and compare to other 3D benchmarks. We find large accuracy benefits of the adaptive 3D approach.
The basic idea behind a wearable robotic grasp assistancesystem is to support people that suffer from severe motor impairments in daily activities. Such a system needs to act mostly autonomously and according to the user’s intent. Vision-based hand pose estimation could be an integral part of a larger control and assistance framework. In this paper we evaluate the performance of egocentric monocular hand pose estimation for a robot-controlled hand exoskeleton in a simulation. For hand pose estimation we adopt a Convolutional Neural Network (CNN). We train and evaluate this network with computer graphics, created by our own data generator. In order to guide further design decisions we focus in our experiments on two egocentric camera viewpoints tested on synthetic data with the help of a 3D-scanned hand model, with and without an exoskeleton attached to it.We observe that hand pose estimation with a wrist-mounted camera performs more accurate than with a head-mounted camera in the context of our simulation. Further, a grasp assistance system attached to the hand alters visual appearance and can improve hand pose estimation. Our experiment provides useful insights for the integration of sensors into a context sensitive analysis framework for intelligent assistance.
There is still a great reliance on human expert knowledge during the analog integrated circuit sizing design phase due to its complexity and scale, with the result that there is a very low level of automation associated with it. Current research shows that reinforcement learning is a promising approach for addressing this issue. Similarly, it has been shown that the convergence of conventional optimization approaches can be improved by transforming the design space from the geometrical domain into the electrical domain. Here, this design space transformation is employed as an alternative action space for deep reinforcement learning agents. The presented approach is based entirely on reinforcement learning, whereby agents are trained in the craft of analog circuit sizing without explicit expert guidance. After training and evaluating agents on circuits of varying complexity, their behavior when confronted with a different technology, is examined, showing the applicability, feasibility as well as transferability of this approach.
We present an approach for segmenting individual cells and lamellipodia in epithelial cell clusters using fully convolutional neural networks. The method will set the basis for measuring cell cluster dynamics and expansion to improve the investigation of collective cell migration phenomena. The fully learning-based front-end avoids classical feature engineering, yet the network architecture needs to be designed carefully. Our network predicts how likely each pixel belongs to one of the classes and, thus, is able to segment the image. Besides characterizing segmentation performance, we discuss how the network will be further employed.
Enhancing data-driven algorithms for human pose estimation and action recognition through simulation
(2020)
Recognizing human actions, reliably inferring their meaning and being able to potentially exchange mutual social information are core challenges for autonomous systems when they directly share the same space with humans. Intelligent transport systems in particular face this challenge, as interactions with people are often required. The development and testing of technical perception solutions is done mostly on standard vision benchmark datasets for which manual labelling of sensory ground truth has been a tedious but necessary task. Furthermore, rarely occurring human activities are underrepresented in these datasets, leading to algorithms not recognizing such activities. For this purpose, we introduce a modular simulation framework, which offers to train and validate algorithms on various human-centred scenarios. We describe the usage of simulation data to train a state-of-the-art human pose estimation algorithm to recognize unusual human activities in urban areas. Since the recognition of human actions can be an important component of intelligent transport systems, we investigated how simulations can be applied for his purpose. Laboratory experiments show that we can train a recurrent neural network with only simulated data based on motion capture data and 3D avatars, which achieves an almost perfect performance in the classification of those human actions on real data.
In any autonomous driving system, the map for localization plays a vital part that is often underestimated. The map describes the world around the vehicle outside of the sensor view and is a main input into the decision making process in highly complicated scenarios. Thus there are strict requirements towards the accuracy and timeliness of the map. We present a robust and reliable approach towards crowd based mapping using a GraphSLAM framework based on radar sensors. We show on a parking lot that even in dynamically changing environments, the localization results are very accurate and reliable even in unexplored terrain without any map data. This can be achieved by collaborative map updates from multiple vehicles. To show these claims experimentally, the Joint Graph Optimization is compared to the ground truth on an industrial parking space. Mapping performance is evaluated using a dense map from a total station as reference and localization results are compared with a deeply coupled DGPS/INS system.
On the way to achieving higher degrees of autonomy for vehicles in complicated, ever changing scenarios, the localization problem poses a very important role. Especially the Simultaneous Localization and Mapping (SLAM) problem has been studied greatly in the past. For an autonomous system in the real world, we present a very cost-efficient, robust and very precise localization approach based on GraphSLAM and graph optimization using radar sensors. We are able to prove on a dynamically changing parking lot layout that both mapping and localization accuracy are very high. To evaluate the performance of the mapping algorithm, a highly accurate ground truth map generated from a total station was used. Localization results are compared to a high precision DGPS/INS system. Utilizing these methods, we can show the strong performance of our algorithm.
Learning to translate between real world and simulated 3D sensors while transferring task models
(2019)
Learning-based vision tasks are usually specialized on the sensor technology for which data has been labeled. The knowledge of a learned model is simply useless when it comes to data which differs from the data on which the model has been initially trained or if the model should be applied to a totally different imaging or sensor source. New labeled data has to be acquired on which a new model can be trained. Depending on the sensor, this can even get more complicated when the sensor data becomes more abstract and hard to be interpreted and labeled by humans. To enable reuse of models trained for a specific task across different sensors minimizes the data acquisition effort. Therefore, this work focuses on learning sensor models and translating between them, thus aiming for sensor interoperability. We show that even for the complex task of human pose estimation from 3D depth data recorded with different sensors, i.e. a simulated and a Kinect 2TM depth sensor, human pose estimation can greatly improve by translating between sensor models without modifying the original task model. This process especially benefits sensors and applications for which labels and models are difficult if at all possible to retrieve from raw sensor data.
This paper presents a machine learning powered, procedural sizing methodology based on pre-computed look-up tables containing operating point characteristics of primitive devices. Several Neural Networks are trained for 90nm and 45nm technologies, mapping different electrical parameters to the corresponding dimensions of a primitive device. This transforms the geometric sizing problem into the domain of circuit design experts, where the desired electrical characteristics are now inputs to the model. Analog building blocks or entire circuits are expressed as a sequence of model evaluations, capturing the sizing strategy and intention of the designer in a procedure, which is reusable across different technology nodes. The methodology is employed for the sizing of two operational amplifiers, and evaluated for two technology nodes, showing the versatility and efficiency of this approach.
We present a multitask network that supports various deep neural network based pedestrian detection functions. Besides 2D and 3D human pose, it also supports body and head orientation estimation based on full body bounding box input. This eliminates the need for explicit face recognition. We show that the performance of 3D human pose estimation and orientation estimation is comparable to the state-of-the-art. Since very few data sets exist for 3D human pose and in particular body and head orientation estimation based on full body data, we further show the benefit of particular simulation data to train the network. The network architecture is relatively simple, yet powerful, and easily adaptable for further research and applications.