Proteogenomic gene structure validation in the pineapple genome

Bibliographic Details
Main Authors: Ariffin, Norazrin, Newman, David Wells, O’cualain, Ronan, Nelson, Michael G., Hubbard, Simon J.
Format: Article
Language: English
Published: American Chemical Society
Online Access: http://psasir.upm.edu.my/id/eprint/116163/
http://psasir.upm.edu.my/id/eprint/116163/1/116163.pdf
Repository: UPM Institutional Repository
Description: MD2 pineapple (Ananas comosus) is the second most important tropical crop that preserves crassulacean acid metabolism (CAM), which confers high water-use efficiency, and it is fast becoming the most consumed fresh fruit worldwide. Despite this environmental efficiency and popularity, until very recently its genome sequence had not been determined and a high-quality annotated proteome was not available. Here, we have undertaken a pilot proteogenomic study, analyzing the proteome of MD2 pineapple leaves using liquid chromatography-mass spectrometry (LC-MS/MS), which validates 1781 predicted proteins in the annotated F153 (V3) genome. A further 603 peptide identifications map exclusively to an independent MD2 transcriptome-derived database and are absent from the standard F153 (V3) annotated proteome. Peptide identifications derived from these MD2 transcripts are also cross-referenced to a more recent and complete MD2 genome annotation, resulting in 402 nonoverlapping peptides, which in turn support 30 high-quality gene candidates novel to both pineapple genomes. Many of the validated F153 (V3) genes are also supported by an independent proteomics data set collected for an ornamental pineapple variety. The contigs and peptides have been mapped to the current F153 genome build and are available as BED files to display a custom gene track on the Ensembl Plants region viewer. These analyses add to the knowledge of experimentally validated pineapple genes and demonstrate the utility of transcript-derived proteomics to discover both novel genes and genetic structure in a plant genome, adding value to its annotation.
Record ID: upm-116163
Institution: Universiti Putra Malaysia
Citation: Ariffin, Norazrin and Newman, David Wells and O’cualain, Ronan and Nelson, Michael G. and Hubbard, Simon J. Proteogenomic gene structure validation in the pineapple genome. Journal of Proteome Research, 23 (5), 1583-1592. DOI: 10.1021/acs.jproteome.3c00675.s003
Peer reviewed: Yes
License: CC BY 4.0