Towards computation of novel ideas from corpora of scientific text

In this work we present a method for the computation of novel 'ideas' from corpora of scientific text. The system functions by first detecting concept noun-phrases within the titles and abstracts of publications using Part-Of-Speech tagging, before classifying these into sets of problem an...

Bibliographic Details
Main Authors: Liu, Haixia; Goulding, James; Brailsford, Tim
Format: Book Section
Language: English
Published: Springer Verlag 2015
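Subjects: Idea mining; Text mining; Natural language processing; Recommender systems; Collaborative filtering
Online Access: https://eprints.nottingham.ac.uk/55719/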
author Liu, Haixia
Goulding, James
Brailsford, Tim
building Nottingham Research Data Repository
collection Online Access
description In this work we present a method for the computation of novel 'ideas' from corpora of scientific text. The system functions by first detecting concept noun-phrases within the titles and abstracts of publications using Part-Of-Speech tagging, before classifying these into sets of problem and solution phrases via a target-word matching approach. By defining an idea as a co-occurring <problem, solution> pair, known-idea triples can be constructed through the additional assignment of a relevance value (computed via either phrase co-occurrence or an 'idea frequency-inverse document frequency' score). The resulting triples are then fed into a collaborative filtering algorithm, where problem-phrases are considered as users and solution-phrases as the items to be recommended. The final output is a ranked list of novel idea candidates, which hold potential for researchers to integrate into their hypothesis generation processes. This approach is evaluated using a subset of publications from the journal Science, with precision, recall and F-Measure results for a variety of model parametrizations indicating that the system is capable of generating useful novel ideas in an automated fashion.
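The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions: noun phrases are taken as already extracted (the Part-Of-Speech tagging step is skipped), the target-word lists PROBLEM_WORDS and SOLUTION_WORDS are hypothetical, relevance is the raw co-occurrence count rather than the 'idea frequency-inverse document frequency' score (which this record does not define), and the collaborative filtering step is a simple user-based neighbourhood model with cosine similarity, since the record does not say which algorithm was used.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical target words; the record does not list the ones actually used.
PROBLEM_WORDS = {"problem", "disease", "failure"}
SOLUTION_WORDS = {"method", "treatment", "algorithm"}


def classify(phrase):
    """Label a noun phrase by target-word matching (problem words checked first)."""
    words = set(phrase.lower().split())
    if words & PROBLEM_WORDS:
        return "problem"
    if words & SOLUTION_WORDS:
        return "solution"
    return None


def known_ideas(docs):
    """Build known-idea triples as {(problem, solution): co-occurrence count}.

    docs is a list of documents, each a list of already-extracted noun phrases.
    """
    counts = defaultdict(int)
    for phrases in docs:
        problems = {p for p in phrases if classify(p) == "problem"}
        solutions = {p for p in phrases if classify(p) == "solution"}
        for p in problems:
            for s in solutions:
                counts[(p, s)] += 1
    return dict(counts)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    num = sum(a[k] * b[k] for k in set(a) & set(b))
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0


def recommend(ideas, problem, top_n=5):
    """Rank solutions not yet paired with `problem` (problems act as users)."""
    rows = defaultdict(dict)
    for (p, s), relevance in ideas.items():
        rows[p][s] = relevance
    target = rows.get(problem, {})
    scores = defaultdict(float)
    for other, row in rows.items():
        if other == problem:
            continue
        sim = cosine(target, row)
        if sim <= 0:
            continue
        for s, relevance in row.items():
            if s not in target:  # only novel <problem, solution> pairs
                scores[s] += sim * relevance
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy corpus: one list of noun phrases per abstract.
docs = [
    ["heart disease", "drug treatment"],
    ["heart disease", "drug treatment", "screening method"],
    ["kidney disease", "drug treatment"],
]
ideas = known_ideas(docs)
candidates = recommend(ideas, "kidney disease")
print(candidates)
```

In this toy corpus, "kidney disease" shares a known solution with "heart disease", so the one solution seen only with "heart disease" is proposed as a novel idea candidate for "kidney disease".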
first_indexed 2025-11-14T20:31:55Z
format Book Section
id nottingham-55719
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
last_indexed 2025-11-14T20:31:55Z
publishDate 2015
publisher Springer Verlag
recordtype eprints
repository_type Digital Repository
citation Liu, Haixia, Goulding, James and Brailsford, Tim (2015) Towards computation of novel ideas from corpora of scientific text. In: Machine Learning and Knowledge Discovery in Databases. Springer Verlag, Cham, Switzerland, pp. 541-556. doi:10.1007/978-3-319-23525-7_33
date 2015-08-29
record_timestamp 2019-01-11T17:05:52Z
status PeerReviewed
fulltext application/pdf en https://eprints.nottingham.ac.uk/55719/1/PKDD_Brainstorming.pdf
publisher_url https://link.springer.com/chapter/10.1007/978-3-319-23525-7_33
title Towards computation of novel ideas from corpora of scientific text
topic Idea mining
Text mining
Natural language processing
Recommender systems
Collaborative filtering
url https://eprints.nottingham.ac.uk/55719/