Informatik
Facial expressions play a dominant role in facilitating social interactions. We endeavor to develop tactile displays that reinstate facial-expression-modulated communication. The high spatial and temporal dimensionality of facial movements poses a unique challenge when designing tactile encodings of them. A further challenge is developing encodings that are attuned to the perceptual characteristics of our skin. A caveat of using vibrotactile displays is that tactile stimuli have been shown to induce perceptual tactile aftereffects when applied to the fingers, arm and face. However, despite the prevalence of waist-worn tactile displays, no investigations of tactile aftereffects at the waist exist in the literature, even though they are warranted by the unique sensory and perceptual signalling characteristics of this area. Using an adaptation paradigm, we investigated perceptual tactile aftereffects induced by continuous and burst vibrotactile stimuli delivered at the navel, side and spinal regions of the waist. We report evidence that the tactile perception topology of the waist is non-uniform: the navel and spine regions are resistant to adaptive aftereffects, while the side regions are more prone to perceptual adaptation to continuous, but not burst, stimulation. These results highlight the unique set of challenges posed by designing waist-worn tactile displays. These and future perceptual studies can directly inform more realistic and effective implementations of complex high-dimensional spatiotemporal social cues.
In any autonomous driving system, the map used for localization plays a vital part that is often underestimated. The map describes the world around the vehicle beyond the sensor view and is a main input into the decision-making process in highly complicated scenarios. There are therefore strict requirements on the accuracy and timeliness of the map. We present a robust and reliable approach to crowd-based mapping using a GraphSLAM framework based on radar sensors. We show on a parking lot that, even in dynamically changing environments and in unexplored terrain without any map data, the localization results are very accurate and reliable. This is achieved through collaborative map updates from multiple vehicles. To support these claims experimentally, the Joint Graph Optimization is compared to ground truth on an industrial parking space. Mapping performance is evaluated using a dense map from a total station as reference, and localization results are compared with a deeply coupled DGPS/INS system.
Significant advances have been achieved in mobile robot localization and mapping in dynamic environments; however, these approaches are mostly incapable of dealing with the physical properties of automotive radar sensors. In this paper we present an accurate and robust solution to this problem by introducing a memory-efficient cluster map representation. Our approach is validated by experiments on a public parking space with pedestrians, moving cars, and varying parking configurations, providing a challenging dynamic environment. The results prove its ability to reproducibly localize our vehicle within an error margin below 1% with respect to ground truth, using only point-based radar targets. A decay process enables our map representation to support local updates.
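The decay process mentioned above can be illustrated with a small sketch: each cluster carries a confidence weight that fades with elapsed time, so stale landmarks (e.g. a re-parked car) eventually drop out of the map and local updates can replace them. The cluster structure, half-life, and threshold below are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical decay-based map update: cluster weights fade exponentially
# over time so outdated landmarks are eventually removed from the map.
import math

def decay_weights(clusters, dt, half_life=60.0, threshold=0.1):
    """Decay each cluster's confidence weight by elapsed time dt (seconds),
    then drop clusters whose support has faded below the threshold."""
    rate = math.log(2.0) / half_life
    for c in clusters:
        c["weight"] *= math.exp(-rate * dt)
    return [c for c in clusters if c["weight"] > threshold]

clusters = [{"pos": (1.0, 2.0), "weight": 1.0},   # well-supported cluster
            {"pos": (3.0, 4.0), "weight": 0.15}]  # weakly supported cluster
clusters = decay_weights(clusters, dt=120.0)  # two half-lives elapse
```

After two half-lives the strong cluster survives at a quarter of its weight, while the weak one falls below the threshold and is pruned.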
On the way to achieving higher degrees of autonomy for vehicles in complicated, ever-changing scenarios, localization plays a very important role. The Simultaneous Localization and Mapping (SLAM) problem in particular has been studied extensively in the past. For an autonomous system in the real world, we present a cost-efficient, robust and very precise localization approach based on GraphSLAM and graph optimization using radar sensors. We show on a dynamically changing parking lot layout that both mapping and localization accuracy are very high. To evaluate the performance of the mapping algorithm, a highly accurate ground-truth map generated from a total station was used. Localization results are compared to a high-precision DGPS/INS system. Using these methods, we demonstrate the strong performance of our algorithm.
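The core idea behind the graph optimization used in the two GraphSLAM abstracts above can be shown with a toy example: poses are nodes, measurements (odometry or landmark observations) are edges, and optimization finds the poses that minimize the sum of squared measurement residuals. This 1-D linear least-squares toy is purely illustrative and is not the papers' radar-based formulation.

```python
# Toy 1-D pose graph: three poses, three relative measurements (edges).
# The first pose is anchored at the origin; the remaining poses are solved
# for by least squares over the measurement residuals.
import numpy as np

# edges: (from_pose, to_pose, measured displacement)
edges = [(0, 1, 1.0), (1, 2, 1.1), (0, 2, 2.0)]  # slightly inconsistent

# build the linear system A x = b over the unknowns x1, x2
A = np.zeros((len(edges), 2))
b = np.zeros(len(edges))
for k, (i, j, z) in enumerate(edges):
    if i > 0:
        A[k, i - 1] -= 1.0  # residual: x_j - x_i - z
    if j > 0:
        A[k, j - 1] += 1.0
    b[k] = z

x, *_ = np.linalg.lstsq(A, b, rcond=None)  # poses reconciling all edges
```

Because the three measurements are mutually inconsistent, no exact solution exists; least squares spreads the error across all edges, which is exactly what full graph optimization does at scale with nonlinear 2-D/3-D pose constraints.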
Avatars are used when interacting in virtual environments in different contexts: in collaborative work, in gaming, and in virtual meetings with friends. It is therefore important to understand how the relationship between user and avatar works. In this study, an online survey is used to determine how the perception of an avatar changes across contexts by relating it to existing avatar relationship typologies. Additionally, it is determined whether a realistic, abstract or comic-like representation is preferred by the participants in each context. One result was a preference for low-poly representations in the work context, which are associated with the perception of the avatar as a tool. In the context of meeting friends, a realistic representation is perceived as more appropriate and as an accurate self-representation. In the gaming context, the results are less clear, which can be attributed to differing gaming preferences. Here, unlike in the other contexts, a comic-like representation is also perceived as appropriate, which is associated with the perception of the avatar as a friend. A symbiotic user-avatar relationship is not directly related to any form of representation but always lies in the midfield, which is attributed to the fact that it represents a whole spectrum between the other categories.
Recognizing human actions, reliably inferring their meaning and being able to potentially exchange mutual social information are core challenges for autonomous systems when they directly share the same space with humans. Today's technical perception solutions have been developed and tested mostly on standard vision benchmark datasets, for which manual labeling of sensory ground truth is a tedious but necessary task. Furthermore, rarely occurring human activities are underrepresented in such data, leading to algorithms that do not recognize these activities. To address this, we introduce a modular simulation framework that allows algorithms to be trained and validated under various environmental conditions. For this paper we created a dataset containing rare human activities in urban areas, on which a current state-of-the-art pose estimation algorithm fails, and demonstrate how to train for such rare poses with simulated data only.
Recognizing human actions is a core challenge for autonomous systems, as they directly share the same space with humans. Systems must be able to recognize and assess human actions in real time. To train the corresponding data-driven algorithms, a significant amount of annotated training data is required. We demonstrate a pipeline to detect humans, estimate their pose, track them over time and recognize their actions in real time with standard monocular camera sensors. For action recognition, we transform noisy human pose estimates into an image-like format we call the Encoded Human Pose Image (EHPI). This encoded information can then be classified using standard methods from the computer vision community. With this simple procedure, we achieve competitive state-of-the-art performance in pose-based action detection while ensuring real-time performance. In addition, we show a use case in the context of autonomous driving to demonstrate how such a system can be trained to recognize human actions using simulation data.
Enhancing data-driven algorithms for human pose estimation and action recognition through simulation
(2020)
Recognizing human actions, reliably inferring their meaning and being able to potentially exchange mutual social information are core challenges for autonomous systems when they directly share the same space with humans. Intelligent transport systems in particular face this challenge, as interactions with people are often required. The development and testing of technical perception solutions is done mostly on standard vision benchmark datasets, for which manual labelling of sensory ground truth has been a tedious but necessary task. Furthermore, rarely occurring human activities are underrepresented in these datasets, leading to algorithms that do not recognize such activities. To address this, we introduce a modular simulation framework that allows algorithms to be trained and validated on various human-centred scenarios. We describe the use of simulation data to train a state-of-the-art human pose estimation algorithm to recognize unusual human activities in urban areas. Since the recognition of human actions can be an important component of intelligent transport systems, we investigated how simulations can be applied for this purpose. Laboratory experiments show that we can train a recurrent neural network with only simulated data, based on motion capture data and 3D avatars, which achieves an almost perfect performance in the classification of those human actions on real data.
The segmentation and tracking of minimally invasive, robot-guided instruments is an essential component of various computer-assisted interventions. In minimally invasive surgery, the application field of the approach described here, however, reflections, shadows and visual occlusions caused by smoke and organs frequently arise and complicate the segmentation and tracking of the instruments.
This contribution presents a deep learning approach for markerless tracking of minimally invasive instruments, tested on both simulated and real data. A simulated and a real dataset with ground-truth labels for the binary segmentation of instrument and background are created. For the simulated dataset, images are composed of a simulated instrument and a real background; for the real dataset, images are composed of a real instrument and background. Overall, a pixel accuracy of 94.70 percent is achieved on the simulated data and 87.30 percent on the real data.
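The pixel-accuracy figures reported above measure the fraction of pixels classified correctly across a binary instrument/background mask. A minimal sketch of this standard metric:

```python
# Pixel accuracy for binary segmentation: fraction of pixels where the
# predicted mask agrees with the ground-truth mask.
import numpy as np

def pixel_accuracy(pred, gt):
    """pred, gt: binary masks of equal shape (1 = instrument, 0 = background)."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    return float(np.mean(pred == gt))

pred = np.array([[1, 0], [1, 1]])
gt = np.array([[1, 0], [0, 1]])
print(pixel_accuracy(pred, gt))  # 3 of 4 pixels match -> 0.75
```

Note that with a strong class imbalance (background usually dominates the image), pixel accuracy can look high even for poor instrument masks, which is why metrics such as IoU or the Dice score are often reported alongside it.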
As production workspaces become more mobile and dynamic, it becomes increasingly important to reliably monitor the overall state of the environment. Manipulators and other robotic systems will likely have to act autonomously alongside humans and other systems within a joint workspace. Such interactions require that all components in non-stationary environments be able to perceive their state relative to each other. As vision sensors provide a rich source of information to accomplish this, we present RoPose, a convolutional neural network (CNN) based approach to estimate the two-dimensional joint configuration of a simulated industrial manipulator from a camera image. This pose information can further be used by a novel targetless calibration setup to estimate the pose of the camera relative to the manipulator's workspace. We present a pipeline to automatically generate synthetic training data and conclude with a discussion of the potential use of the same pipeline to acquire real image datasets of physically existing robots.