Towards computation of novel ideas from corpora of scientific text
In this work we present a method for the computation of novel 'ideas' from corpora of scientific text. The system functions by first detecting concept noun-phrases within the titles and abstracts of publications using Part-Of-Speech tagging, before classifying these into sets of problem an...
| Main Authors: | Liu, Haixia; Goulding, James; Brailsford, Tim |
|---|---|
| Format: | Book Section |
| Language: | English |
| Published: | Springer Verlag 2015 |
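| Subjects: | Idea mining; Text mining; Natural language processing; Recommender systems; Collaborative filtering |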
| Online Access: | https://eprints.nottingham.ac.uk/55719/ |
| _version_ | 1848799202928230400 |
|---|---|
| author | Liu, Haixia; Goulding, James; Brailsford, Tim |
| author_facet | Liu, Haixia; Goulding, James; Brailsford, Tim |
| author_sort | Liu, Haixia |
| building | Nottingham Research Data Repository |
| collection | Online Access |
| description | In this work we present a method for the computation of novel 'ideas' from corpora of scientific text. The system functions by first detecting concept noun-phrases within the titles and abstracts of publications using Part-Of-Speech tagging, before classifying these into sets of problem and solution phrases via a target-word matching approach. By defining an idea as a co-occurring <problem,solution> pair, known-idea triples can be constructed through the additional assignment of a relevance value (computed via either phrase co-occurrence or an 'idea frequency-inverse document frequency' score). The resulting triples are then fed into a collaborative filtering algorithm, where problem-phrases are considered as users and solution-phrases as the items to be recommended. The final output is a ranked list of novel idea candidates, which hold potential for researchers to integrate into their hypothesis generation processes. This approach is evaluated using a subset of publications from the journal Science, with precision, recall and F-Measure results for a variety of model parametrizations indicating that the system is capable of generating useful novel ideas in an automated fashion. |
| first_indexed | 2025-11-14T20:31:55Z |
| format | Book Section |
| id | nottingham-55719 |
| institution | University of Nottingham Malaysia Campus |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-14T20:31:55Z |
| publishDate | 2015 |
| publisher | Springer Verlag |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | nottingham-55719; 2019-01-11T17:05:52Z; https://eprints.nottingham.ac.uk/55719/; Towards computation of novel ideas from corpora of scientific text; Liu, Haixia; Goulding, James; Brailsford, Tim; Springer Verlag; 2015-08-29; Book Section; PeerReviewed; application/pdf; en; https://eprints.nottingham.ac.uk/55719/1/PKDD_Brainstorming.pdf; Liu, Haixia, Goulding, James and Brailsford, Tim (2015) Towards computation of novel ideas from corpora of scientific text. In: Machine Learning and Knowledge Discovery in Databases. Springer Verlag, Cham, Switzerland, pp. 541-556; Idea mining; Text mining; Natural language processing; Recommender systems; Collaborative filtering; https://link.springer.com/chapter/10.1007/978-3-319-23525-7_33; doi:10.1007/978-3-319-23525-7_33 |
| spellingShingle | Idea mining; Text mining; Natural language processing; Recommender systems; Collaborative filtering; Liu, Haixia; Goulding, James; Brailsford, Tim; Towards computation of novel ideas from corpora of scientific text |
| title | Towards computation of novel ideas from corpora of scientific text |
| title_sort | towards computation of novel ideas from corpora of scientific text |
| topic | Idea mining; Text mining; Natural language processing; Recommender systems; Collaborative filtering |
| url | https://eprints.nottingham.ac.uk/55719/ |
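
The description in this record outlines a pipeline in which co-occurring <problem,solution> pairs receive a relevance value and are then passed to collaborative filtering, with problem-phrases acting as users and solution-phrases as items. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes phrase extraction and problem/solution classification have already been done upstream, uses raw co-occurrence counts as the relevance value, and swaps in a simple user-based cosine-similarity recommender because the record does not say which collaborative filtering algorithm was used. All function names and the toy phrases are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import sqrt


def build_triples(docs):
    """Count how often each (problem, solution) pair co-occurs.

    docs is a list of (problem_phrases, solution_phrases), one per publication.
    The count is used here as the relevance value of the triple (an assumption).
    """
    counts = defaultdict(int)
    for problems, solutions in docs:
        for p, s in product(set(problems), set(solutions)):
            counts[(p, s)] += 1
    return dict(counts)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(triples, problem, top_n=5):
    """Rank solutions not yet paired with `problem`.

    Problems play the role of users and solutions the role of items.
    A candidate's score is the similarity-weighted sum of the relevance
    values it has under other, similar problems.
    """
    by_problem = defaultdict(dict)
    for (p, s), relevance in triples.items():
        by_problem[p][s] = relevance

    target = by_problem.get(problem, {})
    scores = defaultdict(float)
    for other, solutions in by_problem.items():
        if other == problem:
            continue
        sim = cosine(target, solutions)
        if sim == 0.0:
            continue
        for s, relevance in solutions.items():
            if s not in target:  # keep only pairs not already known
                scores[s] += sim * relevance
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy input, invented purely for illustration.
docs = [
    (["overfitting"], ["dropout", "regularization"]),
    (["overfitting"], ["regularization"]),
    (["high variance"], ["regularization", "bagging"]),
    (["slow convergence"], ["momentum"]),
]
triples = build_triples(docs)
print(recommend(triples, "overfitting"))
```

In this toy input, "overfitting" and "high variance" share the solution "regularization", so the solution that only "high variance" has ("bagging") is proposed as a candidate for "overfitting". "momentum" is not proposed, because "slow convergence" shares no solutions with the target problem.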