Extracting reusable document components for variable data printing

Variable Data Printing (VDP) has brought new flexibility and dynamism to the printed page. Each printed instance of a specific class of document can now have different degrees of customized content within the document template. This flexibility comes at a cost. If every printed page is potentiall...

Bibliographic Details
Main Authors:	Bagley, Steven R., Brailsford, David F., Ollis, James A.
Format:	Conference or Workshop Item
Published:	2007
Subjects:	PostScript, PDF, SVG, graphic objects, Content Extraction, Variable Data Printing
Online Access:	https://eprints.nottingham.ac.uk/931/

_version_	1848790505070002176
author	Bagley, Steven R. Brailsford, David F. Ollis, James A.
author_sort	Bagley, Steven R.
building	Nottingham Research Data Repository
collection	Online Access
description	Variable Data Printing (VDP) has brought new flexibility and dynamism to the printed page. Each printed instance of a specific class of document can now have different degrees of customized content within the document template. This flexibility comes at a cost. If every printed page is potentially different from all others it must be rasterized separately, which is a time-consuming process. Technologies such as PPML (Personalized Print Markup Language) attempt to address this problem by dividing the bitmapped page into components that can be cached at the raster level, thereby speeding up the generation of page instances. A large number of documents are stored in Page Description Languages at a higher level of abstraction than the bitmapped page. Much of this content could be reused within a VDP environment provided that separable document components can be identified and extracted. These components then need to be individually rasterisable so that each high-level component can be related to its low-level (bitmap) equivalent. Unfortunately, the unstructured nature of most Page Description Languages makes it difficult to extract content easily. This paper outlines the problems encountered in extracting component-based content from existing page description formats, such as PostScript, PDF and SVG, and how the differences between the formats affect the ease with which content can be extracted. The techniques are illustrated with reference to a tool called COG Extractor, which extracts content from PDF and SVG and prepares it for reuse.
first_indexed	2025-11-14T18:13:41Z
format	Conference or Workshop Item
id	nottingham-931
institution	University of Nottingham Malaysia Campus
institution_category	Local University
last_indexed	2025-11-14T18:13:41Z
publishDate	2007
recordtype	eprints
repository_type	Digital Repository
spelling	nottingham-931 2020-05-04T20:28:38Z https://eprints.nottingham.ac.uk/931/ 2007 Conference or Workshop Item PeerReviewed Bagley, Steven R., Brailsford, David F. and Ollis, James A. (2007) Extracting reusable document components for variable data printing. In: ACM Symposium on Document Engineering, 29-31 August 2007, Winnipeg, Canada. http://doi.acm.org/10.1145/1284420.1284435
title	Extracting reusable document components for variable data printing
topic	PostScript PDF SVG graphic objects Content Extraction Variable Data Printing.
url	https://eprints.nottingham.ac.uk/931/

Similar Items