Comparison and fusion of retrieval schemes based on different structures, similarity measures and weighting schemes

Many retrieval models and techniques can be applied to retrieve theses that are most relevant to certain queries or concepts. It has been found that different retrieval methods often retrieve different sets of relevant documents. It is therefore anticipated that a particular retrieval method will us...


Bibliographic Details
Main Author: Wahlan, Mohammed Salem Farag
Format: Thesis
Language: English
Published: 2006
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/4067/
http://eprints.utm.my/4067/1/MohammedSalemFaragWahlanMFSKSM2006.pdf
_version_ 1848890710033432576
author Wahlan, Mohammed Salem Farag
building UTeM Institutional Repository
collection Online Access
description Many retrieval models and techniques can be applied to retrieve the theses most relevant to a given query or concept. Different retrieval methods have been found to retrieve different sets of relevant documents, so a particular method can be expected to retrieve some relevant theses that other methods miss. This study therefore retrieves theses using different thesis structures, similarity measures and weighting schemes. The theses were collected from the FSKSM postgraduate library and were digitized, stripped of stop words, stemmed and indexed; the results of these operations are stored in a database. The study uses 85 theses and 30 queries. Queries were compared with theses using five similarity measures and seven weighting schemes over the different thesis structures. The results show that using the bibliography gives poorer results than using the title and abstract alone. For combinations of weighting schemes, schemes used with the Cosine and Tanimoto measures perform well individually but poorly in combination, whereas schemes used with the Forbes and Russell measures perform poorly individually but well in combination. For combinations of similarity measures on the title structure, the best combination was Cosine with the LTU weighting scheme plus Russell with the LOGG weighting scheme. On the abstract structure, the best combination was Cosine with the TFIDF weighting scheme plus Forbes with the ATFA weighting scheme, but it performed worse than the best title-structure combination. Overall, the best thesis structure is the title and the best similarity measure is Cosine with the LTU weighting scheme.
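The comparison and fusion of retrieval schemes described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the record does not give the formulas for the LTU, LOGG and ATFA weighting schemes or for the Forbes and Russell measures, so the sketch assumes plain tf × idf weighting, the standard Cosine and Tanimoto coefficients, and a CombSUM-style fusion over min-max normalised scores. The toy titles, the query and all function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenised document (plain tf * idf)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: tf * math.log(n / df[term]) for term, tf in Counter(doc).items()}
        for doc in docs
    ]


def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(weight * v.get(term, 0.0) for term, weight in u.items())


def cosine(u, v):
    """Cosine coefficient: u.v / (|u| * |v|)."""
    denom = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / denom if denom else 0.0


def tanimoto(u, v):
    """Tanimoto coefficient: u.v / (|u|^2 + |v|^2 - u.v)."""
    uv = dot(u, v)
    denom = dot(u, u) + dot(v, v) - uv
    return uv / denom if denom else 0.0


def min_max(scores):
    """Rescale one run's scores to [0, 1] so that runs are comparable."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]


def comb_sum(*runs):
    """Assumed fusion rule: sum each document's normalised scores across runs."""
    return [sum(column) for column in zip(*(min_max(run) for run in runs))]


# Hypothetical, already stemmed and stop-word-free thesis titles and one query.
titles = [
    ["fusion", "retrieval", "scheme"],
    ["neural", "network", "image"],
    ["retrieval", "similarity", "measure"],
]
query = ["retrieval", "fusion"]

# Weight the query together with the titles so that both share one idf table.
*doc_vecs, query_vec = tfidf_vectors(titles + [query])

cosine_run = [cosine(query_vec, d) for d in doc_vecs]
tanimoto_run = [tanimoto(query_vec, d) for d in doc_vecs]
fused = comb_sum(cosine_run, tanimoto_run)

# Indices of the titles, best match first.
ranking = sorted(range(len(titles)), key=lambda i: fused[i], reverse=True)
print(ranking)
```

Running each measure as a separate run and normalising before summing keeps one measure's score range from dominating the fused ranking; other fusion rules or other weighting schemes could be swapped in without changing the structure.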
first_indexed 2025-11-15T20:46:23Z
format Thesis
id utm-4067
institution Universiti Teknologi Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T20:46:23Z
publishDate 2006
recordtype eprints
repository_type Digital Repository
spelling utm-4067 2018-01-15T04:24:11Z http://eprints.utm.my/4067/ Comparison and fusion of retrieval schemes based on different structures, similarity measures and weighting schemes Wahlan, Mohammed Salem Farag QA75 Electronic computers. Computer science 2006-03 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/4067/1/MohammedSalemFaragWahlanMFSKSM2006.pdf Wahlan, Mohammed Salem Farag (2006) Comparison and fusion of retrieval schemes based on different structures, similarity measures and weighting schemes. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computer Science and Information System.
title Comparison and fusion of retrieval schemes based on different structures, similarity measures and weighting schemes
topic QA75 Electronic computers. Computer science