Pipeline Collector: Gathering performance data for distributed astronomical pipelines

Bibliographic Details
Main Authors: Mechev, A., Plaat, A., Oonk, J., Intema, Hubertus, Röttgering, H.
Format: Journal Article
Published: 2018
Online Access: http://hdl.handle.net/20.500.11937/74864
author Mechev, A.
Plaat, A.
Oonk, J.
Intema, Hubertus
Röttgering, H.
building Curtin Institutional Repository
collection Online Access
description Modern astronomical data processing requires complex software pipelines to process ever-growing datasets. For radio astronomy, these pipelines have become so large that they need to be distributed across a computational cluster. This makes it difficult to monitor the performance of each pipeline step. To gain insight into the performance of each step, a performance monitoring utility needs to be integrated with the pipeline execution. In this work we have developed such a utility and integrated it with the calibration pipeline of the Low Frequency Array (LOFAR), a leading radio telescope. We tested the tool by running the pipeline on several different compute platforms and collected the performance data. Based on these data, we make well-informed recommendations on future hardware and software upgrades. The aim of these upgrades is to accelerate the slowest processing steps of this LOFAR pipeline. The pipeline_collector suite is open source and will be incorporated in future LOFAR pipelines to create a performance database for all LOFAR processing.
first_indexed 2025-11-14T11:02:46Z
format Journal Article
id curtin-20.500.11937-74864
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T11:02:46Z
publishDate 2018
recordtype eprints
repository_type Digital Repository
doi 10.1016/j.ascom.2018.06.005
title Pipeline Collector: Gathering performance data for distributed astronomical pipelines
url http://hdl.handle.net/20.500.11937/74864