Question Answering System Based on Knowledge Graphs.pdf

KBQA: Learning Question Answering over QA Corpora and Knowledge Bases
Wanyun Cui@FUDAN, Yanghua Xiao*@FUDAN, Haixun Wang@Facebook, Yangqiu Song@HKUST, Seung-won Hwang@Yonsei, Wei Wang@FUDAN
kw.fudan.edu.cn/qa

Background
• Question Answering (QA) systems answer natural language questions.
• Examples: IBM Watson, Google Now, Apple Siri, Amazon Alexa, Microsoft Cortana

Why QA?
• QA as an application:
  • One of the most natural forms of human-computer interaction
  • A key component of chatbots, which attract wide research interest from industry
• QA for AI:
  • One of the most important tasks for evaluating machine intelligence: the Turing test
  • An important testbed for many AI techniques, such as machine learning, natural language processing, and machine cognition

Why KBQA?
More and more knowledge bases are being created
• Google Knowledge Graph, YAGO, WordNet, Freebase, Probase, NELL, Cyc, DBpedia
• Large scale, clean data
[Figure: The boost of knowledge bases]
[Figure: A piece of a knowledge base, which consists of triples such as (d, population, 390k)]

How does KB-based QA work?
• Convert natural language questions into structured queries over knowledge bases.
  Question: How many people live in Honolulu?
  SPARQL: SELECT ?number WHERE { Res:Honolulu dbo:population ?number }
  SQL: SELECT value FROM KB WHERE subject = 'd' AND predicate = 'population'
• Key: predicate inference

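The lookup that both structured queries express can be sketched in a few lines of Python. This is only an illustration, not the system described in the slides: the triple (d, population, 390k) comes from the slides, while the `name` triple linking `d` to "Honolulu" and the helper names `entity_id` and `lookup` are assumptions made for the example.

```python
# Toy knowledge base of (subject, predicate, object) triples.
# ("d", "population", "390k") is taken from the slides; the "name"
# triple that ties entity "d" to "Honolulu" is an assumption.
KB = [
    ("d", "name", "Honolulu"),
    ("d", "population", "390k"),
]


def entity_id(kb, name):
    """Return the subject whose (assumed) "name" predicate equals `name`."""
    for subject, predicate, obj in kb:
        if predicate == "name" and obj == name:
            return subject
    return None


def lookup(kb, subject, predicate):
    """Return every value v such that (subject, predicate, v) is in the KB."""
    return [o for s, p, o in kb if s == subject and p == predicate]


# "How many people live in Honolulu?" reduces to this lookup once the
# entity (Honolulu) and the predicate (population) are known.
answer = lookup(KB, entity_id(KB, "Honolulu"), "population")
print(answer)
```

Once the entity is identified, the only remaining unknown is the predicate, which is why the slide singles out predicate inference as the key step.
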
Two challenges for predicate inference
• Question representation
  • Identify questions with the same semantics
  • Distinguish questions with different intents
• Semantic matching
  • Map the question representation to the predicate in the KB
  • Vocabulary gap

Weaknesses of previous solutions
• Template/rule-based approaches
  • Questions are strings: represent questions by string-based templates, such as regular expressions
  • Built by human labeling
  • PROs:
    • User-controllable
    • Applicable to industrial use
  • CONs:
    • Costly human effort
    • Not good at handling the diversity of questions
• Neural network-based approaches
  • Questions are numeric: represent questions by numeric embeddings
  • Built by learning from a corpus
  • PROs:
    • Able to understand diverse questions
  • CONs:
    • Poor interpretability
    • Not controllable; unfriendly to industrial applications
How can we retain the advantages of both approaches?

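To make the weakness of string-based templates concrete, here is a minimal sketch of a hand-written regular-expression template. The pattern, the `STRING_TEMPLATES` table, and the function name `match_string_template` are hypothetical and only illustrate the general idea of rule-based matching.

```python
import re

# Hypothetical hand-written templates: each regular expression is paired
# with the KB predicate it is meant to trigger.
STRING_TEMPLATES = [
    (
        re.compile(r"^how many people live in (?P<entity>.+?)\?*$", re.IGNORECASE),
        "population",
    ),
]


def match_string_template(question):
    """Return (entity, predicate) for the first matching template, else None."""
    for pattern, predicate in STRING_TEMPLATES:
        match = pattern.match(question.strip())
        if match:
            return match.group("entity"), predicate
    return None


# The phrasing the rule was written for is handled...
print(match_string_template("How many people live in Honolulu?"))
# ...but a paraphrase with the same intent is missed until someone
# writes another rule for it.
print(match_string_template("What is the population of Honolulu?"))
```

Every new phrasing needs another hand-written pattern, which is the human cost and the poor coverage of diverse questions listed under CONs.
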
Our approach
• Representation: concept-based templates
  • Questions ask about entities
  • Interpretable
  • User-controllable
• Learn templates from a QA corpus instead of constructing them manually
  • 27 million templates, 2782 intents
  • Understand diverse questions
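
A concept-based template can be pictured as a question in which the entity mention is replaced by its concept. The sketch below is an assumption-laden illustration, not the authors' implementation: the `$concept` notation, the `ENTITY_CONCEPTS` map, the `TEMPLATE_TO_PREDICATE` table (standing in for templates that would be learned from a QA corpus), and the example city "Shanghai" are all hypothetical.

```python
# Hypothetical entity -> concept map (a real system would obtain this
# from a taxonomy or knowledge base).
ENTITY_CONCEPTS = {"Honolulu": "city", "Shanghai": "city"}

# Hypothetical template -> predicate table, standing in for mappings
# that would be learned from a QA corpus rather than written by hand.
TEMPLATE_TO_PREDICATE = {
    "how many people live in $city?": "population",
    "what is the population of $city?": "population",
}


def to_template(question):
    """Replace the first known entity with "$concept"; return (template, entity)."""
    for entity, concept in ENTITY_CONCEPTS.items():
        if entity in question:
            return question.replace(entity, "$" + concept).lower(), entity
    return None


def infer_predicate(question):
    """Return (entity, predicate) if the question's template is known, else None."""
    result = to_template(question)
    if result is None:
        return None
    template, entity = result
    predicate = TEMPLATE_TO_PREDICATE.get(template)
    return (entity, predicate) if predicate is not None else None


print(infer_predicate("How many people live in Honolulu?"))
print(infer_predicate("What is the population of Shanghai?"))
```

Because a template abstracts over the concept rather than a specific string, one entry covers every entity of that concept, and the table stays readable and editable by a person.
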