OPUS 4 | 004 Informatik

004 Informatik

1 search hit

TFCSG: An Unsupervised Approach for Question-retrieval Over Multi-task Learning (2023)

Aiguo, Shang ; Danner, Michael ; Xinjuan, Zhu ; Rätsch, Matthias

Most question-answering (QA) systems rely on training data to reach their optimal performance. However, acquiring training data for supervised systems is both time-consuming and resource-intensive. To address this, we propose TFCSG, an unsupervised similar-question retrieval approach that leverages pre-trained language models and multi-task learning. First, topic keywords are extracted sequentially from question sentences using a latent topic-filtering algorithm to construct an unsupervised training corpus. Then, a multi-task learning method with three tasks is used to build the question retrieval model. The first task is short-sentence contrastive learning. The second judges the similarity between a question sentence and its corresponding topic sequence. The third generates the corresponding topic sequence from a question sentence. The three tasks train the language model in parallel. Finally, similar questions are obtained by calculating the cosine similarity between sentence vectors. Comparison experiments on public question datasets show that TFCSG outperforms the unsupervised baseline used for comparison. Moreover, no manual annotation is needed, which greatly saves human resources.
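The final retrieval step described in the abstract, ranking candidate questions by the cosine similarity of their sentence vectors, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `cosine_similarity` and `retrieve_similar`, the `top_k` parameter, and the three-dimensional toy vectors are all assumptions made for the example. In TFCSG the vectors would come from the trained language model.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_similar(query_vec, candidates, top_k=2):
    """Rank (text, vector) candidates by cosine similarity to the query vector."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy sentence vectors standing in for language-model embeddings.
candidates = [
    ("How do I reset my password?", [0.9, 0.1, 0.0]),
    ("What is the refund policy?", [0.0, 0.2, 0.9]),
    ("How can I change my password?", [0.8, 0.3, 0.1]),
]
query = [1.0, 0.2, 0.0]

results = retrieve_similar(query, candidates, top_k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

Because cosine similarity depends only on the direction of the vectors, the ranking is unaffected by differences in vector length between questions.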

Open Access
