Anomaly detection in large-scale data stream networks
This paper addresses the anomaly detection problem in large-scale data mining applications using residual subspace analysis. We are specifically concerned with situations where the full data cannot be practically obtained due to physical limitations such as low bandwidth, limited memory, storage, or...
| Main Authors: | Pham, DucSon; Venkatesh, S.; Lazarescu, Mihai; Budhaditya, S. |
|---|---|
| Format: | Journal Article |
| Published: | Springer 2014 |
| Subjects: | |
| Online Access: | http://hdl.handle.net/20.500.11937/43829 |
| _version_ | 1848756820627161088 |
|---|---|
| author | Pham, DucSon; Venkatesh, S.; Lazarescu, Mihai; Budhaditya, S. |
| author_facet | Pham, DucSon; Venkatesh, S.; Lazarescu, Mihai; Budhaditya, S. |
| author_sort | Pham, DucSon |
| building | Curtin Institutional Repository |
| collection | Online Access |
| description | This paper addresses the anomaly detection problem in large-scale data mining applications using residual subspace analysis. We are specifically concerned with situations where the full data cannot be practically obtained due to physical limitations such as low bandwidth, limited memory, storage, or computing power. Motivated by the recent compressed sensing (CS) theory, we suggest a framework wherein random projection can be used to obtain compressed data, addressing the scalability challenge. Our theoretical contribution shows that the spectral property of the CS data is approximately preserved under such a projection and thus the performance of spectral-based methods for anomaly detection is almost equivalent to the case in which the raw data is completely available. Our second contribution is the construction of the framework to use this result and detect anomalies in the compressed data directly, thus circumventing the problems of data acquisition in large sensor networks. We have conducted extensive experiments to detect anomalies in network and surveillance applications on large datasets, including the benchmark PETS 2007 and 83 GB of real footage from three public train stations. Our results show that our proposed method is scalable, and importantly, its performance is comparable to conventional methods for anomaly detection when the complete data is available. |
| first_indexed | 2025-11-14T09:18:17Z |
| format | Journal Article |
| id | curtin-20.500.11937-43829 |
| institution | Curtin University Malaysia |
| institution_category | Local University |
| last_indexed | 2025-11-14T09:18:17Z |
| publishDate | 2014 |
| publisher | Springer |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | curtin-20.500.11937-438292019-02-19T05:35:06Z Anomaly detection in large-scale data stream networks Pham, DucSon Venkatesh, S. Lazarescu, Mihai Budhaditya, S. anomaly detection sensor network data spectral methods compressed sensing stream data processing residual subspace analysis random projection This paper addresses the anomaly detection problem in large-scale data mining applications using residual subspace analysis. We are specifically concerned with situations where the full data cannot be practically obtained due to physical limitations such as low bandwidth, limited memory, storage, or computing power. Motivated by the recent compressed sensing (CS) theory, we suggest a framework wherein random projection can be used to obtain compressed data, addressing the scalability challenge. Our theoretical contribution shows that the spectral property of the CS data is approximately preserved under such a projection and thus the performance of spectral-based methods for anomaly detection is almost equivalent to the case in which the raw data is completely available. Our second contribution is the construction of the framework to use this result and detect anomalies in the compressed data directly, thus circumventing the problems of data acquisition in large sensor networks. We have conducted extensive experiments to detect anomalies in network and surveillance applications on large datasets, including the benchmark PETS 2007 and 83 GB of real footage from three public train stations. Our results show that our proposed method is scalable, and importantly, its performance is comparable to conventional methods for anomaly detection when the complete data is available. 2014 Journal Article http://hdl.handle.net/20.500.11937/43829 10.1007/s10618-012-0297-3 Springer fulltext |
| spellingShingle | anomaly detection sensor network data spectral methods compressed sensing stream data processing residual subspace analysis random projection Pham, DucSon Venkatesh, S. Lazarescu, Mihai Budhaditya, S. Anomaly detection in large-scale data stream networks |
| title | Anomaly detection in large-scale data stream networks |
| title_full | Anomaly detection in large-scale data stream networks |
| title_fullStr | Anomaly detection in large-scale data stream networks |
| title_full_unstemmed | Anomaly detection in large-scale data stream networks |
| title_short | Anomaly detection in large-scale data stream networks |
| title_sort | anomaly detection in large-scale data stream networks |
| topic | anomaly detection; sensor network data; spectral methods; compressed sensing; stream data processing; residual subspace analysis; random projection |
| url | http://hdl.handle.net/20.500.11937/43829 |
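
The description above outlines residual subspace analysis applied directly to randomly projected data. The following is a minimal sketch of that idea on synthetic data, not the authors' implementation: the Gaussian projection matrix, the dimensions `n`, `d`, `k`, `p`, the injected anomaly rows, and the helper names `residual_scores` and `random_projection` are all assumptions chosen for illustration.

```python
import numpy as np

def residual_scores(X, k):
    """Squared norm of each row of X outside the top-k principal subspace."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centred data give the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                      # (features, k) basis of the principal subspace
    residual = Xc - Xc @ V @ V.T      # part of each row in the residual subspace
    return np.sum(residual ** 2, axis=1)

def random_projection(X, p, rng):
    """Compress rows of X from d to p dimensions with a Gaussian matrix (assumed choice)."""
    d = X.shape[1]
    Phi = rng.standard_normal((d, p)) / np.sqrt(p)  # scaling keeps norms roughly unchanged
    return X @ Phi

rng = np.random.default_rng(0)
n, d, k, p = 500, 200, 3, 40          # hypothetical sizes: samples, raw dim, rank, compressed dim

# Synthetic "normal" data: strong rank-k structure plus small noise.
basis = rng.standard_normal((k, d))
X = 5.0 * (rng.standard_normal((n, k)) @ basis) + 0.1 * rng.standard_normal((n, d))

# Inject three anomalous rows that do not follow the low-rank structure.
anomalies = np.array([50, 200, 350])
X[anomalies] += 3.0 * rng.standard_normal((len(anomalies), d))

# Score rows on the raw data and on the compressed data.
full = residual_scores(X, k)
comp = residual_scores(random_projection(X, p, rng), k)

# Flag the three rows with the largest residual energy in each case.
top_full = sorted(np.argsort(full)[-3:].tolist())
top_comp = sorted(np.argsort(comp)[-3:].tolist())
print("flagged on raw data:       ", top_full)
print("flagged on compressed data:", top_comp)
```

In this toy setting the compressed data has one fifth of the original dimensions, yet the rows with the largest residual energy coincide with those found on the raw data, which is the kind of behaviour the description attributes to approximate preservation of spectral properties. A real deployment would also need a principled threshold on the residual score rather than a fixed top-3 cut.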