Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores

User-Item (U-I) matrix has been used as the dominant data infrastructure of Collaborative Filtering (CF). To reduce space consumption in runtime and storage, caused by data sparsity and growing need to accommodate side information in CF design, one needs to go beyond the UI Matrix. In this paper, we...

Full description

Bibliographic Details
Main Authors: Peng, Xiangjun, Wang, Qingfeng, Sun, Xu, Gong, Chunye, Wang, Yaohua
Format: Conference or Workshop Item
Language:English
Published: 2019
Subjects:
Online Access:https://eprints.nottingham.ac.uk/60535/
_version_ 1848799774428364800
author Peng, Xiangjun
Wang, Qingfeng
Sun, Xu
Gong, Chunye
Wang, Yaohua
author_facet Peng, Xiangjun
Wang, Qingfeng
Sun, Xu
Gong, Chunye
Wang, Yaohua
author_sort Peng, Xiangjun
building Nottingham Research Data Repository
collection Online Access
description User-Item (U-I) matrix has been used as the dominant data infrastructure of Collaborative Filtering (CF). To reduce space consumption in runtime and storage, caused by data sparsity and growing need to accommodate side information in CF design, one needs to go beyond the UI Matrix. In this paper, we took a case study of Succinct Representations in Collaborative Filtering, rather than using a U-I Matrix. Our key insight is to introduce Succinct Data Structures as a new infrastructure of CF. Towards this, we implemented a User-based K-Nearest-Neighbor CF prototype via Wavelet Tree, by first designing a Accessible Compressed Documents (ACD) to compress U-I data in Wavelet Tree, which is efficient in both storage and runtime. Then, we showed that ACD can be applied to develop an efficient intersection algorithm without decompression, by taking advantage of ACD’s characteristics. We evaluated our design on 1,000 cores of Tianhe-II supercomputer, with one of the largest public data set ml-20m. The results showed that our prototype could achieve 3.7 minutes on average to deliver the results.
first_indexed 2025-11-14T20:41:00Z
format Conference or Workshop Item
id nottingham-60535
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
last_indexed 2025-11-14T20:41:00Z
publishDate 2019
recordtype eprints
repository_type Digital Repository
spelling nottingham-605352020-05-08T01:05:40Z https://eprints.nottingham.ac.uk/60535/ Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores Peng, Xiangjun Wang, Qingfeng Sun, Xu Gong, Chunye Wang, Yaohua User-Item (U-I) matrix has been used as the dominant data infrastructure of Collaborative Filtering (CF). To reduce space consumption in runtime and storage, caused by data sparsity and growing need to accommodate side information in CF design, one needs to go beyond the UI Matrix. In this paper, we took a case study of Succinct Representations in Collaborative Filtering, rather than using a U-I Matrix. Our key insight is to introduce Succinct Data Structures as a new infrastructure of CF. Towards this, we implemented a User-based K-Nearest-Neighbor CF prototype via Wavelet Tree, by first designing a Accessible Compressed Documents (ACD) to compress U-I data in Wavelet Tree, which is efficient in both storage and runtime. Then, we showed that ACD can be applied to develop an efficient intersection algorithm without decompression, by taking advantage of ACD’s characteristics. We evaluated our design on 1,000 cores of Tianhe-II supercomputer, with one of the largest public data set ml-20m. The results showed that our prototype could achieve 3.7 minutes on average to deliver the results. 2019-03-12 Conference or Workshop Item PeerReviewed application/pdf en cc_by https://eprints.nottingham.ac.uk/60535/1/Succinct%20representations%20in%20collaborative%20filtering%20A%20case%20study%20using%20wavelet%20tree%20on%201%2C000%20cores.pdf Peng, Xiangjun, Wang, Qingfeng, Sun, Xu, Gong, Chunye and Wang, Yaohua (2019) Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores. In: 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), 5-7 Dec. 2019, Gold Coast, Australia, Australia. Succinct Data Structures; Collaborative Filtering; Supercomputing http://dx.doi.org/10.1109/PDCAT46702.2019.00083 10.1109/PDCAT46702.2019.00083 10.1109/PDCAT46702.2019.00083 10.1109/PDCAT46702.2019.00083
spellingShingle Succinct Data Structures; Collaborative Filtering; Supercomputing
Peng, Xiangjun
Wang, Qingfeng
Sun, Xu
Gong, Chunye
Wang, Yaohua
Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores
title Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores
title_full Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores
title_fullStr Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores
title_full_unstemmed Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores
title_short Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores
title_sort succinct representations in collaborative filtering: a case study using wavelet tree on 1,000 cores
topic Succinct Data Structures; Collaborative Filtering; Supercomputing
url https://eprints.nottingham.ac.uk/60535/
https://eprints.nottingham.ac.uk/60535/
https://eprints.nottingham.ac.uk/60535/