Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores
User-Item (U-I) matrix has been used as the dominant data infrastructure of Collaborative Filtering (CF). To reduce space consumption in runtime and storage, caused by data sparsity and growing need to accommodate side information in CF design, one needs to go beyond the UI Matrix. In this paper, we...
| Main Authors: | , , , , |
|---|---|
| Format: | Conference or Workshop Item |
| Language: | English |
| Published: |
2019
|
| Subjects: | |
| Online Access: | https://eprints.nottingham.ac.uk/60535/ |
| _version_ | 1848799774428364800 |
|---|---|
| author | Peng, Xiangjun Wang, Qingfeng Sun, Xu Gong, Chunye Wang, Yaohua |
| author_facet | Peng, Xiangjun Wang, Qingfeng Sun, Xu Gong, Chunye Wang, Yaohua |
| author_sort | Peng, Xiangjun |
| building | Nottingham Research Data Repository |
| collection | Online Access |
| description | User-Item (U-I) matrix has been used as the dominant data infrastructure of Collaborative Filtering (CF). To reduce space consumption in runtime and storage, caused by data sparsity and growing need to accommodate side information in CF design, one needs to go beyond the UI Matrix. In this paper, we took a case study of Succinct Representations in Collaborative Filtering, rather than using a U-I Matrix. Our key insight is to introduce Succinct Data Structures as a new infrastructure of CF. Towards this, we implemented a User-based K-Nearest-Neighbor CF prototype via Wavelet Tree, by first designing a Accessible Compressed Documents (ACD) to compress U-I data in Wavelet Tree, which is efficient in both storage and runtime. Then, we showed that ACD can be applied to develop an efficient intersection algorithm without decompression, by taking advantage of ACD’s characteristics. We evaluated our design on 1,000 cores of Tianhe-II supercomputer, with one of the largest public data set ml-20m. The results showed that our prototype could achieve 3.7 minutes on average to deliver the results. |
| first_indexed | 2025-11-14T20:41:00Z |
| format | Conference or Workshop Item |
| id | nottingham-60535 |
| institution | University of Nottingham Malaysia Campus |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-14T20:41:00Z |
| publishDate | 2019 |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | nottingham-605352020-05-08T01:05:40Z https://eprints.nottingham.ac.uk/60535/ Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores Peng, Xiangjun Wang, Qingfeng Sun, Xu Gong, Chunye Wang, Yaohua User-Item (U-I) matrix has been used as the dominant data infrastructure of Collaborative Filtering (CF). To reduce space consumption in runtime and storage, caused by data sparsity and growing need to accommodate side information in CF design, one needs to go beyond the UI Matrix. In this paper, we took a case study of Succinct Representations in Collaborative Filtering, rather than using a U-I Matrix. Our key insight is to introduce Succinct Data Structures as a new infrastructure of CF. Towards this, we implemented a User-based K-Nearest-Neighbor CF prototype via Wavelet Tree, by first designing a Accessible Compressed Documents (ACD) to compress U-I data in Wavelet Tree, which is efficient in both storage and runtime. Then, we showed that ACD can be applied to develop an efficient intersection algorithm without decompression, by taking advantage of ACD’s characteristics. We evaluated our design on 1,000 cores of Tianhe-II supercomputer, with one of the largest public data set ml-20m. The results showed that our prototype could achieve 3.7 minutes on average to deliver the results. 2019-03-12 Conference or Workshop Item PeerReviewed application/pdf en cc_by https://eprints.nottingham.ac.uk/60535/1/Succinct%20representations%20in%20collaborative%20filtering%20A%20case%20study%20using%20wavelet%20tree%20on%201%2C000%20cores.pdf Peng, Xiangjun, Wang, Qingfeng, Sun, Xu, Gong, Chunye and Wang, Yaohua (2019) Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores. In: 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), 5-7 Dec. 2019, Gold Coast, Australia, Australia. Succinct Data Structures; Collaborative Filtering; Supercomputing http://dx.doi.org/10.1109/PDCAT46702.2019.00083 10.1109/PDCAT46702.2019.00083 10.1109/PDCAT46702.2019.00083 10.1109/PDCAT46702.2019.00083 |
| spellingShingle | Succinct Data Structures; Collaborative Filtering; Supercomputing Peng, Xiangjun Wang, Qingfeng Sun, Xu Gong, Chunye Wang, Yaohua Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores |
| title | Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores |
| title_full | Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores |
| title_fullStr | Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores |
| title_full_unstemmed | Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores |
| title_short | Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores |
| title_sort | succinct representations in collaborative filtering: a case study using wavelet tree on 1,000 cores |
| topic | Succinct Data Structures; Collaborative Filtering; Supercomputing |
| url | https://eprints.nottingham.ac.uk/60535/ https://eprints.nottingham.ac.uk/60535/ https://eprints.nottingham.ac.uk/60535/ |