OPUS 4 | Search

Fusion of tracking techniques to enhance adaptive real-time tracking of arbitrary objects (2014)

Poschmann, Peter ; Huber, Patrik ; Rätsch, Matthias ; Kittler, Josef ; Böhme, Hans-Joachim

In visual adaptive tracking, the tracker adapts to the target, background, and conditions of the image sequence. Each update introduces some error, so the tracker might drift away from the target over time. To increase robustness against this drift, we present three ideas on top of a particle filter framework: an optical-flow-based motion estimation, a learning strategy for preventing bad updates while staying adaptive, and a sliding window detector for failure detection and finding the best training examples. We experimentally evaluate the ideas using the BoBoT dataset. The code of our tracker is available online.
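As a rough illustration of the particle filter framework the abstract builds on, the following is a minimal sketch of one predict/update/resample cycle for a one-dimensional position. It is not the authors' tracker: the function name, the Gaussian likelihood, the multinomial resampling, and all parameters are assumptions chosen for illustration. The `motion` argument stands in for a displacement that could come from an optical-flow estimate.

```python
import math
import random


def particle_filter_step(particles, weights, motion, measurement,
                         noise_std, meas_std, rng):
    """One predict/update/resample cycle for a 1-D position state.

    All names and the Gaussian models are illustrative assumptions.
    """
    # Predict: shift every particle by the estimated motion plus diffusion noise.
    predicted = [p + motion + rng.gauss(0.0, noise_std) for p in particles]

    # Update: reweight each particle by a Gaussian likelihood of the measurement.
    likelihoods = [
        math.exp(-0.5 * ((measurement - p) / meas_std) ** 2) for p in predicted
    ]
    new_weights = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(new_weights)
    if total == 0.0:
        # Every particle is implausible: fall back to uniform weights.
        new_weights = [1.0 / len(predicted)] * len(predicted)
    else:
        new_weights = [w / total for w in new_weights]

    # State estimate: weighted mean of the predicted particles.
    estimate = sum(p * w for p, w in zip(predicted, new_weights))

    # Resample (multinomial) so that likely particles are duplicated.
    resampled = rng.choices(predicted, weights=new_weights, k=len(predicted))
    uniform = [1.0 / len(resampled)] * len(resampled)
    return resampled, uniform, estimate


rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(200)]
weights = [1.0 / len(particles)] * len(particles)
particles, weights, estimate = particle_filter_step(
    particles, weights, motion=0.0, measurement=5.0,
    noise_std=0.1, meas_std=0.5, rng=rng,
)
```

In a drift-aware tracker the measurement would come from an appearance model that is itself updated over time, which is where the learning strategy and the detector described in the abstract would come in.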

Person tracking and 3D model-based face analysis for robust human-robot interaction (2015)

Kopp, Philipp ; Grupp, Michael ; Huber, Patrik ; Rätsch, Matthias

We presented our robot framework and our efforts to make face analysis more robust towards self-occlusion caused by head pose. By using a lightweight linear fitting algorithm, we are able to obtain 3D models of human faces in real-time. The combination of adaptive tracking and 3D face modelling for the analysis of human faces is used as a basis for further research on human-machine interaction on our SCITOS robot platform.

Real-time 3D face fitting and texture fusion on in-the-wild videos (2017)

Huber, Patrik ; Kopp, Philipp ; Christmas, William ; Rätsch, Matthias ; Kittler, Josef

We present a fully automatic approach to real-time 3D face reconstruction from monocular in-the-wild videos. Using cascaded-regressor-based face tracking and 3D morphable face model shape fitting, we obtain a semidense 3D face shape. We further use the texture information from multiple frames to build a holistic 3D face representation from the video footage. Our system is able to capture facial expressions and does not require any person-specific training. We demonstrate the robustness of our approach on the challenging 300 Videos in the Wild (300-VW) dataset. Our real-time fitting framework is available as an open-source library at http://4dface.org.

Fast and robust RGB-D scene labeling for autonomous driving (2018)

Jasch, Manuel ; Weber, Thomas ; Rätsch, Matthias

For autonomously driving cars and intelligent vehicles, it is crucial to understand the scene context, including objects in the surroundings. A fundamental technique for accomplishing this is scene labeling, that is, assigning a semantic class to each pixel in a scene image. This task is commonly tackled quite well by fully convolutional neural networks (FCN). Crucial factors are a small model size and a low execution time. This work presents the first method that exploits depth cues together with confidence estimates in a CNN. To this end, a novel, experimentally grounded network architecture is proposed to perform robust scene labeling that does not require costly preprocessing like CRFs or LSTMs as commonly used in related work. The effectiveness of this approach is demonstrated in an extensive evaluation on a challenging real-world dataset. The new architecture is highly optimized for high accuracy and low execution time.

Emotion model implementation for parameterized facial animation in human-robot-interaction (2016)

Wittig, Steffen ; Kloos, Uwe ; Rätsch, Matthias

In recent years, robotic systems have matured enough to perform simple home or office tasks, guide visitors in environments such as museums or stores, and aid people in their daily life. To make the interaction with service and even industrial robots as fast and intuitive as possible, researchers strive to create transparent interfaces close to human-human interaction. As facial expressions play a central role in human-human communication, robot faces were implemented with varying degrees of human-likeness and expressiveness. We propose an emotion model to parameterize a screen-based facial animation via inter-process communication. The software will animate transitions and add further animations to make a digital face appear “alive” and equip a robotic system with a virtual face. The result will be an inviting appearance to motivate potential users to seek interaction with the robot.

An interactive clothing design and personalized virtual display system (2018)

Zhu, Xin-juan ; Lu, Haiqing ; Rätsch, Matthias

An interactive clothing design and a personalized virtual display with the user’s own face are presented in this paper to meet the requirements of personalized clothing customization. A customer-interactive clothing design approach based on genetic engineering ideas is analyzed, taking a suit as an example. Customers can thus rearrange the clothing style elements, choose from the available colors and fabrics, and come up with their own personalized suit style. A web 3D customization prototype system for personalized clothing is developed based on Unity3D and VR technology. The layout of the structure and functions, combined with the flow of the system, is given. Practical issues such as 3D face scanning, suit style design, fabric selection, and accessory choices are also addressed. Tests of the prototype system indicate that it can show realistic clothing and fabric effects and offer an effective visual and customization experience to users.

Face naming in news images via multiple instance learning and hybrid recurrent convolutional neural network (2018)

Su, Xueping ; Zhou, Hangchi ; Draghici, Viorel Petrut ; Rätsch, Matthias

Annotations of subject IDs in images are very important as ground truth for face recognition applications and news retrieval systems. Face naming is becoming a significant research topic in news image indexing applications. By exploiting the uniqueness of names, face naming is transformed into a multiple instance learning (MIL) problem with an exclusive constraint, namely the eMIL problem. First, the positive bags and the negative bags are automatically annotated by a hybrid recurrent convolutional neural network and a distributed affinity propagation cluster. Next, positive instance selection and updating are used to reduce the influence of false-positive bags and to improve the performance. Finally, max exclusive density and iterative Max-ED algorithms are proposed to solve the eMIL problem. The experimental results show that the proposed algorithms achieve a significant improvement over other algorithms.

Aesthetic classification of face images based on convolutional neural network model (2019)

Wu, Fei ; Zhu, Xinjuan ; Wu, Xiaojun ; Rätsch, Matthias

To address the low accuracy of face image classification in complex environments, a network model, F-Net, suitable for aesthetic classification of face images is proposed. Based on LeNet-5, the model uses convolutional layers to extract facial image features in complex backgrounds, optimizes the parameters of the network model, and changes the number of convolutional layers and fully connected layer feature elements in the model. The experimental results show that the F-Net network model proposed in this paper achieves a face image classification accuracy of 73% against complex backgrounds, which is better than other classical convolutional neural network classification models.

Mobile-Unet: An efficient convolutional neural network for fabric defect detection (2020)

Jing, Junfeng ; Wang, Zhen ; Rätsch, Matthias ; Zhang, Huanhuan

Deep learning-based fabric defect detection methods have been widely investigated to improve production efficiency and product quality. Although deep learning-based methods have proved to be powerful tools for classification and segmentation, some key issues remain to be addressed when they are applied to real applications. Firstly, the actual fabric production conditions of factories demand higher real-time performance. Moreover, fabric defects as abnormal samples are very rare compared with normal samples, which results in data imbalance and makes deep learning-based model training challenging. To solve these problems, an extremely efficient convolutional neural network, Mobile-Unet, is proposed to achieve end-to-end defect segmentation. The median frequency balancing loss function is used to overcome the challenge of sample imbalance. Additionally, Mobile-Unet introduces depth-wise separable convolution, which dramatically reduces the complexity cost and model size of the network. It comprises two parts: encoder and decoder. The MobileNetV2 feature extractor is used as the encoder, and then five deconvolution layers are added as the decoder. Finally, the softmax layer is used to generate the segmentation mask. The performance of the proposed model has been evaluated on public fabric datasets and self-built fabric datasets. In comparison with other methods, the experimental results demonstrate that the segmentation accuracy and detection speed of the proposed method achieve state-of-the-art performance.
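To make the median frequency balancing idea concrete, here is a minimal sketch of how per-class loss weights could be computed from label masks. It follows the commonly used definition (class frequency measured over the images that contain the class, weight equal to the median frequency divided by the class frequency); whether Mobile-Unet uses exactly this variant is an assumption, and the function name and toy masks are hypothetical.

```python
from statistics import median


def median_frequency_weights(label_images, num_classes):
    """Per-class weights: median(freq) / freq_c.

    freq_c = (pixels labelled c) / (total pixels of images that contain c).
    Classes that never occur are omitted from the result.
    """
    pixel_count = [0] * num_classes    # pixels labelled c across all images
    image_pixels = [0] * num_classes   # total pixels of the images containing c
    for image in label_images:
        flat = [label for row in image for label in row]
        for label in flat:
            pixel_count[label] += 1
        for label in set(flat):
            image_pixels[label] += len(flat)

    freqs = {
        c: pixel_count[c] / image_pixels[c]
        for c in range(num_classes)
        if image_pixels[c] > 0
    }
    med = median(freqs.values())
    return {c: med / f for c, f in freqs.items()}


# Toy masks: class 0 = normal fabric, class 1 = defect (rare).
masks = [
    [[0, 0, 0, 1]],
    [[0, 0, 0, 0]],
]
weights = median_frequency_weights(masks, num_classes=2)
```

The rare defect class receives the larger weight, so its pixels contribute more to a weighted cross-entropy loss than the abundant background pixels.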

Personalized clothing recommendation based on user emotional analysis (2020)

Su, Xueping ; Gao, Meng ; Ren, Jie ; Li, Yunhong ; Rätsch, Matthias

With the continuous development of the economy, consumers pay more attention to personalized clothing. However, the recommendation quality of existing clothing recommendation systems is not sufficient to meet users’ needs. When a user browses clothing online, facial expression is salient information for understanding the user’s preference. In this paper, we propose a novel method to automatically personalize clothing recommendation based on user emotional analysis. Firstly, the facial expression is classified by a multiclass SVM. Next, the user’s multi-interest value is calculated using the expression intensity obtained by a hybrid RCNN. Finally, the multi-interest value is fused to carry out personalized recommendation. The experimental results show that the proposed method achieves a significant improvement over other algorithms.

Automatic identification of focus personage in multi-lingual news images (2021)

Su, Xueping ; Zhu, Danyao ; Ren, Jie ; Rätsch, Matthias

Annotations of character IDs in news images are critical as ground truth for news retrieval and recommendation system. Universality and accuracy optimization of deep neural network models constitutes the key technology to improve the precision and computing efficiency of automatic news character identification, which is attracting increased attention globally. This paper explores the optimized deep neural network model for automatic focus personage identification in multi-lingual news. First, the face model of the focus personage is trained by using the corresponding face images from German news as positive samples. Next, the scheme of Recurrent Convolutional Neural Network (RCNN) + Bi-directional Long-Short Term Memory (Bi-LSTM) + Conditional Random Field (CRF) is utilized to label the focus name, and the RCNN-RCNN encoder–decoder is applied to translate names of people into multiple languages. Third, face features are described by combining the advantages of Local Gabor Binary Pattern Histogram Sequence (LGBPHS) and RCNN, and iterative quantization (ITQ) is used to binarize codes. Finally, a name semantic network is built for different domains. Experiments are performed on a dataset which comprises approximately 100,000 news images. The experimental results demonstrate that the proposed method achieves a significant improvement over other algorithms.

Deep adversarial domain adaptation model for bearing fault diagnosis (2021)

Liu, Zhao-Hua ; Lu, Bi-Liang ; Wei, Hua-Liang ; Rätsch, Matthias

Fault diagnosis of rolling bearings is an essential process for improving the reliability and safety of the rotating machinery. It is always a major challenge to ensure fault diagnosis accuracy in particular under severe working conditions. In this article, a deep adversarial domain adaptation (DADA) model is proposed for rolling bearing fault diagnosis. This model constructs an adversarial adaptation network to solve the commonly encountered problem in numerous real applications: the source domain and the target domain are inconsistent in their distribution. First, a deep stack autoencoder (DSAE) is combined with representative feature learning for dimensionality reduction, and such a combination provides an unsupervised learning method to effectively acquire fault features. Meanwhile, domain adaptation and recognition classification are implemented using a Softmax classifier to augment classification accuracy. Second, the effects of the number of hidden layers in the stack autoencoder network, the number of neurons in each hidden layer, and the hyperparameters of the proposed fault diagnosis algorithm are analyzed. Third, comprehensive analysis is performed on real data to validate the performance of the proposed method; the experimental results demonstrate that the new method outperforms the existing machine learning and deep learning methods, in terms of classification accuracy and generalization ability.

Personalized clothing recommendation fusing the 4-season color system and users’ biological characteristics (2023)

Su, Xueping ; Duan, Jiawei ; Li, Yunhong ; Danner, Michael ; Rätsch, Matthias ; Peng, Jinye

In clothing e-commerce, the challenge of optimally recommending clothing that suits a user’s unique characteristics remains a pressing issue. Many platforms simply recommend best-selling or popular clothing, without taking into account important attributes like the user’s face color, pupil color, face shape, age, etc. To solve this problem, this paper proposes a personalized clothing recommendation algorithm that incorporates the established 4-Season Color System and user-specific biological characteristics. Firstly, the attributes and colors of clothing are classified by the Fnet network, which can learn disjoint label combinations and mitigate the issue of excessive labels. Secondly, on the basis of the 4-Season Color System, the user’s face color model is trained by combined MobileNetV3_DTL, which ensures the model’s generalization and improves the training speed. Thirdly, the user’s face shape and age are divided into different categories by an Inception network. Finally, according to the user’s face color, age, face shape and other information, personalized clothing is recommended in a coarse-to-fine manner. Experiments on five datasets demonstrate that the algorithm proposed in this paper achieves state-of-the-art results.

Pre‑training neural machine translation with alignment information via optimal transport (2023)

Su, Xueping ; Zhao, Xingkai ; Ren, Jie ; Li, Yunhong ; Rätsch, Matthias

With the rapid development of globalization, the demand for translation between different languages is also increasing. Although pre-training has achieved excellent results in neural machine translation, existing neural machine translation has almost no high-quality alignment information suitable for specific fields. This paper therefore proposes a pre-training method for neural machine translation with alignment information via optimal transport. First, this paper narrows the representation gap between different languages by using OTAP to generate domain-specific data for information alignment, and learns richer semantic information. Secondly, this paper proposes a lightweight model, DR-Reformer, which uses Reformer as the backbone network, adds Dropout layers and Reduction layers, reduces model parameters without losing accuracy, and improves computational efficiency. Experiments on the Chinese and English datasets of AI Challenger 2018 and WMT-17 show that the proposed algorithm performs better than existing algorithms.
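The abstract does not say how OTAP computes its alignment, so the following is only a generic sketch of entropy-regularized optimal transport (Sinkhorn iterations), a standard way to obtain a soft alignment between two sets of items from a cost matrix. The function name, the toy cost matrix, and the `epsilon` and `iterations` parameters are assumptions for illustration and are not taken from the paper.

```python
import math


def sinkhorn_plan(cost, a, b, epsilon=0.1, iterations=200):
    """Entropy-regularized optimal transport plan between histograms a and b.

    cost[i][j] is the cost of aligning source item i with target item j.
    Returns a matrix whose rows sum (approximately) to a and columns to b.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)]
              for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example: two source tokens and two target tokens, where matching
# indices are cheap to align and mismatched ones are expensive.
cost = [[0.0, 1.0],
        [1.0, 0.0]]
plan = sinkhorn_plan(cost, a=[0.5, 0.5], b=[0.5, 0.5])
```

In a translation setting the cost could be, for example, a distance between source and target token embeddings, and the large entries of `plan` would then indicate likely word alignments.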

Simulating temporally and spatially correlated wind speed time series by spectral representation method (2023)

Xiao, Qing ; Wu, Lianghong ; Wu, Xiaowen ; Rätsch, Matthias

This paper aims to model wind speed time series at multiple sites. The five-parameter Johnson distribution is deployed to relate the wind speed at each site to a Gaussian time series, and the resultant m-dimensional Gaussian stochastic vector process Z(t) is employed to model the temporal-spatial correlation of wind speeds at m different sites. In general, it is computationally tedious to obtain the autocorrelation functions (ACFs) and cross-correlation functions (CCFs) of Z(t), which are different from those of the wind speed time series. In order to circumvent this correlation distortion problem, the rank ACF and rank CCF are introduced to characterize the temporal-spatial correlation of wind speeds, whereby the ACFs and CCFs of Z(t) can be analytically obtained. Then, Fourier transformation is implemented to establish the cross-spectral density matrix of Z(t), and an analytical approach is proposed to generate samples of wind speeds at m different sites. Finally, simulation experiments are performed to check the proposed methods, and the results verify that the five-parameter Johnson distribution can accurately match distribution functions of wind speeds, and the spectral representation method can well reproduce the temporal-spatial correlation of wind speeds.
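One way to see why rank correlations avoid the distortion problem: a strictly increasing marginal transformation leaves ranks unchanged, so the rank correlation of the wind speeds equals that of the underlying Gaussian process. Assuming the rank ACF and CCF are Spearman rank correlations (the abstract does not specify this), the well-known relation for bivariate Gaussian variables, rho = 2 sin(pi * rho_s / 6), converts them to ordinary correlations in closed form. The function names below are hypothetical.

```python
import math


def spearman_to_gaussian_correlation(rank_corr):
    """Pearson correlation of a bivariate Gaussian pair with the given
    Spearman rank correlation: rho = 2 * sin(pi * rho_s / 6)."""
    if not -1.0 <= rank_corr <= 1.0:
        raise ValueError("rank correlation must lie in [-1, 1]")
    return 2.0 * math.sin(math.pi * rank_corr / 6.0)


def gaussian_to_spearman_correlation(corr):
    """Inverse relation: rho_s = (6 / pi) * asin(rho / 2)."""
    if not -1.0 <= corr <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return (6.0 / math.pi) * math.asin(corr / 2.0)


# Convert a rank autocorrelation sequence (lags 0, 1, 2, ...) of wind speed
# into the autocorrelation sequence of the underlying Gaussian process.
rank_acf = [1.0, 0.8, 0.5, 0.2]
gaussian_acf = [spearman_to_gaussian_correlation(r) for r in rank_acf]
```

The converted sequence is what a spectral method would then Fourier transform to build the spectral density of the Gaussian process.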

15 search hits