Probabilistic models over ordered partitions with applications in document ranking and collaborative filtering
Ranking is an important task for handling large amounts of content. Ideally, training data for supervised ranking would include a complete ranking of documents (or other objects such as images or videos) for a particular query. However, this is only possible for small sets of documents. In practice, o...
| Main Authors: | Tran, Truyen; Phung, Dinh; Venkatesh, Svetha |
|---|---|
| Other Authors: | |
| Format: | Conference Paper |
| Published: | Omnipress, 2011 |
| Online Access: | http://hdl.handle.net/20.500.11937/17402 |
| _version_ | 1848749456948723712 |
|---|---|
| author | Tran, Truyen; Phung, Dinh; Venkatesh, Svetha |
| author2 | Not known |
| author_facet | Not known; Tran, Truyen; Phung, Dinh; Venkatesh, Svetha |
| author_sort | Tran, Truyen |
| building | Curtin Institutional Repository |
| collection | Online Access |
| description | Ranking is an important task for handling large amounts of content. Ideally, training data for supervised ranking would include a complete ranking of documents (or other objects such as images or videos) for a particular query. However, this is only possible for small sets of documents. In practice, one often resorts to document rating, in which each document in a subset is assigned a small number indicating its degree of relevance. This poses a general problem of modelling and learning rank data with ties. In this paper, we propose a probabilistic generative model that models the process as permutations over partitions. This results in a super-exponential combinatorial state space with an unknown number of partitions and an unknown ordering among them. We approach the problem from discrete choice theory, where subsets are chosen in a stagewise manner, significantly reducing the state space at each stage. Further, we show that with suitable parameterisation, we can still learn the models in linear time. We evaluate the proposed models on two application areas: (i) document ranking with data from the recently held Yahoo! challenge, and (ii) collaborative filtering with movie data. The results demonstrate that the models are competitive against well-known rivals. |
| first_indexed | 2025-11-14T07:21:14Z |
| format | Conference Paper |
| id | curtin-20.500.11937-17402 |
| institution | Curtin University Malaysia |
| institution_category | Local University |
| last_indexed | 2025-11-14T07:21:14Z |
| publishDate | 2011 |
| publisher | Omnipress |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | curtin-20.500.11937-174022017-05-30T08:02:17Z Probabilistic models over ordered partitions with applications in document ranking and collaborative filtering Tran, Truyen Phung, Dinh Venkatesh, Svetha Not known Ranking is an important task for handling a large amount of content. Ideally, training data for supervised ranking would include a complete rank of documents (or other objects such as images or videos) for a particular query. However, this is only possible for small sets of documents. In practice, one often resorts to document rating, in that a subset of documents is assigned with a small number indicating the degree of relevance. This poses a general problem of modelling and learning rank data with ties. In this paper, we propose a probabilistic generative model, that models the process as permutations over partitions. This results in super-exponential combinatorial state space with unknown numbers of partitions and unknown ordering among them. We approach the problem from the discrete choice theory, where subsets are chosen in a stage wise manner, reducing the state space per each stage significantly. Further, we show that with suitable parameterisation, we can still learn the models in linear time. We evaluate the proposed models on two application areas: (i) document ranking with the data from the recently held Yahoo! challenge, and (ii) collaborative filtering with movie data. The results demonstrate that the models are competitive against well-known rivals. 2011 Conference Paper http://hdl.handle.net/20.500.11937/17402 Omnipress fulltext |
| spellingShingle | Tran, Truyen Phung, Dinh Venkatesh, Svetha Probabilistic models over ordered partitions with applications in document ranking and collaborative filtering |
| title | Probabilistic models over ordered partitions with applications in document ranking and collaborative filtering |
| title_full | Probabilistic models over ordered partitions with applications in document ranking and collaborative filtering |
| title_fullStr | Probabilistic models over ordered partitions with applications in document ranking and collaborative filtering |
| title_full_unstemmed | Probabilistic models over ordered partitions with applications in document ranking and collaborative filtering |
| title_short | Probabilistic models over ordered partitions with applications in document ranking and collaborative filtering |
| title_sort | probabilistic models over ordered partitions with applications in document ranking and collaborative filtering |
| url | http://hdl.handle.net/20.500.11937/17402 |