Authors: Debayan Banerjee, Pranav Ajit Nair, Ricardo Usbeck, Chris Biemann
In this work, we present an end-to-end Knowledge Graph Question Answering (KGQA) system named GETT-QA. GETT-QA uses T5, a popular text-to-text pre-trained language model. The model takes a question in natural language as input and produces a simpler form of the intended SPARQL query. In the simpler form, the model does not directly produce entity and relation IDs. Instead, it produces corresponding entity and relation labels. The labels are grounded to KG entity and relation IDs in a subsequent step. To further improve the results, we instruct the model to produce a truncated version of the KG embedding for each entity. The truncated KG embedding enables a finer search for disambiguation purposes. We find that T5 is able to learn the truncated KG embeddings without any change of loss function, improving KGQA performance. As a result, we report strong results for LC-QuAD 2.0 and SimpleQuestions-Wikidata datasets on end-to-end KGQA over Wikidata.
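The disambiguation step described above can be sketched in a few lines: the model emits an entity label plus a truncated KG embedding, a label-based lookup returns candidate entity IDs, and the candidates are re-ranked by similarity between the predicted truncated embedding and each candidate's truncated KG embedding. This is a minimal illustrative sketch, not the authors' implementation; the function names, the truncation length `k`, the cosine-similarity scoring, and the example candidate data are all assumptions.

```python
# Hedged sketch (not the paper's code) of grounding a predicted entity label
# to a KG entity ID using truncated KG embeddings for disambiguation.
import math

def truncate(embedding, k=10):
    # Keep only the first k dimensions of the full KG embedding.
    return embedding[:k]

def cosine(a, b):
    # Standard cosine similarity; small epsilon guards against zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm + 1e-9)

def ground_entity(predicted_trunc, candidates, k=10):
    """candidates: list of (entity_id, label, full_embedding) tuples,
    e.g. obtained from a label-based entity search over Wikidata.
    Returns the candidate ID whose truncated embedding is closest
    to the model's predicted truncated embedding."""
    p = truncate(predicted_trunc, k)
    return max(
        candidates,
        key=lambda c: cosine(p, truncate(c[2], k)),
    )[0]

# Toy example: two candidates sharing a similar label, disambiguated
# by the predicted truncated embedding (embedding values are invented).
candidates = [
    ("Q76", "Barack Obama", [0.9, 0.1, 0.0, 0.2]),
    ("Q649593", "Barack Obama Sr.", [0.1, 0.8, 0.3, 0.0]),
]
print(ground_entity([0.85, 0.15, 0.05, 0.1], candidates, k=4))  # → Q76
```

The design point this illustrates is that the truncated embedding gives the grounding step a ranking signal beyond string similarity, which is what the abstract means by "a finer search for disambiguation purposes."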
Paper link: http://arxiv.org/pdf/2303.13284v1