Local word embeddings for query expansion based on co-authorship and citations

© Copyright 2018 for the individual papers by the papers' authors. Word embedding techniques have recently gained a lot of interest from natural language processing researchers, and they are a valuable resource for identifying a list of semantically related terms for a search query. These related t...


Bibliographic Details
Main Authors: Rattinger, A., Le Goff, J., Guetl, Christian
Format: Conference Paper
Published: 2018
Online Access: http://hdl.handle.net/20.500.11937/66959
_version_ 1848761437821861888
author Rattinger, A.
Le Goff, J.
Guetl, Christian
building Curtin Institutional Repository
collection Online Access
description © Copyright 2018 for the individual papers by the papers' authors. Word embedding techniques have recently gained a lot of interest from natural language processing researchers, and they are a valuable resource for identifying a list of semantically related terms for a search query. These related terms are a natural addition for query expansion, but they may mismatch when the application domains use different jargon. Using the Skip-Gram algorithm of Word2Vec, terms are selected only from a specific subset of the corpus, which is extended by documents from co-authorship and citations. We demonstrate that locally trained word embeddings with this extension provide a valuable augmentation and can improve retrieval performance. First results suggest that query expansion and word embeddings could also benefit from other related information.
first_indexed 2025-11-14T10:31:40Z
format Conference Paper
id curtin-20.500.11937-66959
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T10:31:40Z
publishDate 2018
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-66959 2018-05-18T07:56:44Z Local word embeddings for query expansion based on co-authorship and citations Rattinger, A. Le Goff, J. Guetl, Christian © Copyright 2018 for the individual papers by the papers' authors. Word embedding techniques have recently gained a lot of interest from natural language processing researchers, and they are a valuable resource for identifying a list of semantically related terms for a search query. These related terms are a natural addition for query expansion, but they may mismatch when the application domains use different jargon. Using the Skip-Gram algorithm of Word2Vec, terms are selected only from a specific subset of the corpus, which is extended by documents from co-authorship and citations. We demonstrate that locally trained word embeddings with this extension provide a valuable augmentation and can improve retrieval performance. First results suggest that query expansion and word embeddings could also benefit from other related information. 2018 Conference Paper http://hdl.handle.net/20.500.11937/66959 restricted
title Local word embeddings for query expansion based on co-authorship and citations
url http://hdl.handle.net/20.500.11937/66959
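
The description above mentions selecting terms from a specific subset of the corpus that is extended by documents related through co-authorship and citations. The following is a minimal sketch of that selection step only, under stated assumptions: the `Doc` structure, the function names, the sample documents, and the rule that a document joins the subset if it shares an author with a seed document or is linked to one by a citation in either direction are all hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical document record; field names are assumptions."""
    doc_id: str
    tokens: list
    authors: set = field(default_factory=set)
    cites: set = field(default_factory=set)  # ids of documents this one cites


def local_subset(docs, query_terms):
    """Seed set: ids of documents containing at least one query term."""
    wanted = set(query_terms)
    return {d.doc_id for d in docs if wanted & set(d.tokens)}


def extend_subset(docs, seed_ids):
    """Add documents tied to the seed set by co-authorship or citation.

    Assumption: both citation directions count (cited by a seed document,
    or citing a seed document). The source does not specify this.
    """
    seeds = [d for d in docs if d.doc_id in seed_ids]
    seed_authors = set().union(*(d.authors for d in seeds))
    cited_by_seeds = set().union(*(d.cites for d in seeds))

    extended = set(seed_ids)
    for d in docs:
        if d.doc_id in seed_ids:
            continue
        shares_author = bool(d.authors & seed_authors)
        cited_by_seed = d.doc_id in cited_by_seeds
        cites_seed = bool(d.cites & seed_ids)
        if shares_author or cited_by_seed or cites_seed:
            extended.add(d.doc_id)
    return extended


# Invented sample corpus, for illustration only.
docs = [
    Doc("d1", ["query", "expansion", "embeddings"], {"A"}, {"d3"}),
    Doc("d2", ["neural", "ranking"], {"A", "B"}),
    Doc("d3", ["skip", "gram", "vectors"], {"C"}),
    Doc("d4", ["protein", "folding"], {"D"}),
    Doc("d5", ["retrieval", "jargon"], {"E"}, {"d1"}),
]

seed = local_subset(docs, ["expansion"])
extended = extend_subset(docs, seed)

# Token lists of the extended subset, i.e. the local training corpus.
sentences = [d.tokens for d in docs if d.doc_id in extended]
print(sorted(seed), sorted(extended))
```

The resulting `sentences` list could then be passed to a Skip-Gram trainer, for example gensim's `Word2Vec` with `sg=1`, and the nearest neighbours of each query term in the trained model used as expansion candidates. Hyperparameters and the ranking of candidates are left open here because the record does not state them.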