Local word embeddings for query expansion based on co-authorship and citations

Bibliographic Details
Main Authors: Rattinger, A., Le Goff, J., Guetl, Christian
Format: Conference Paper
Published: 2018
Online Access: http://hdl.handle.net/20.500.11937/66959
Description
Summary: © Copyright 2018 for the individual papers by the papers' authors. Word embedding techniques have recently attracted considerable interest from natural language processing researchers, and they are a valuable resource for identifying semantically related terms for a search query. These related terms are a natural source for query expansion, but they can mismatch when application domains use different jargon. Using the Skip-Gram algorithm of Word2Vec, terms are selected only from a specific subset of the corpus, which is extended by documents linked through co-authorship and citations. We demonstrate that locally trained word embeddings with this extension provide a valuable augmentation and can improve retrieval performance. First results suggest that query expansion and word embeddings could also benefit from other related information.
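
The corpus-extension step described in the summary could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `Doc` record, its field names, the `extend_local_corpus` helper, the one-hop linking rule, and the toy documents are all assumptions introduced here. The summary does not say how the initial subset is chosen, so the sketch simply takes a set of seed document ids as given.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical document record; field names are illustrative only."""
    doc_id: str
    authors: set
    cites: set = field(default_factory=set)  # ids of documents this one cites
    text: str = ""


def extend_local_corpus(seed_ids, docs):
    """Return the seed ids plus ids linked by a shared author or a citation.

    Assumption: only direct (one-hop) links are followed.
    """
    by_id = {d.doc_id: d for d in docs}
    seeds = {i for i in seed_ids if i in by_id}
    seed_authors = set().union(*(by_id[i].authors for i in seeds))
    cited_by_seeds = set().union(*(by_id[i].cites for i in seeds))

    selected = set(seeds)
    for d in docs:
        shares_author = bool(d.authors & seed_authors)  # co-authorship link
        cites_seed = bool(d.cites & seeds)              # d cites a seed document
        is_cited = d.doc_id in cited_by_seeds           # a seed document cites d
        if shares_author or cites_seed or is_cited:
            selected.add(d.doc_id)
    return selected


# Toy data, invented for illustration.
docs = [
    Doc("d1", {"A"}, {"d3"}, "skip gram embeddings for retrieval"),
    Doc("d2", {"A", "B"}, set(), "query expansion with related terms"),
    Doc("d3", {"C"}, set(), "domain jargon in scientific search"),
    Doc("d4", {"D"}, {"d1"}, "local training of word vectors"),
    Doc("d5", {"E"}, set(), "unrelated cooking recipes"),
]

local_ids = extend_local_corpus({"d1"}, docs)
# Tokenized texts of the extended subset, ready for an embedding trainer.
sentences = [d.text.split() for d in docs if d.doc_id in local_ids]
```

The resulting `sentences` list is the kind of input a Skip-Gram trainer expects; with gensim, for instance, passing `sg=1` to `Word2Vec` selects Skip-Gram, and the nearest neighbours of each query term in the trained model would then serve as expansion candidates.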