Preparing Laboratory and Real-World EEG Data for Large-Scale Analysis: A Containerized Approach

Large-scale analysis of EEG and other physiological measures promises new insights into brain processes and more accurate and robust brain–computer interface models. However, the absence of standardized vocabularies for annotating events in a machine-understandable manner, the welter of collection-s...

Bibliographic Details
Main Authors: Bigdely-Shamlo, Nima; Makeig, Scott; Robbins, Kay A.
Format: Online
Language: English
Published: Frontiers Media S.A. 2016
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4782059/
id pubmed-4782059
recordtype oai_dc
spelling pubmed-4782059 2016-03-24 Preparing Laboratory and Real-World EEG Data for Large-Scale Analysis: A Containerized Approach Bigdely-Shamlo, Nima; Makeig, Scott; Robbins, Kay A. Neuroscience Frontiers Media S.A. 2016-03-08 /pmc/articles/PMC4782059/ /pubmed/27014048 http://dx.doi.org/10.3389/fninf.2016.00007 Text en Copyright © 2016 Bigdely-Shamlo, Makeig and Robbins. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
description Large-scale analysis of EEG and other physiological measures promises new insights into brain processes and more accurate and robust brain–computer interface models. However, the absence of standardized vocabularies for annotating events in a machine-understandable manner, the welter of collection-specific data organizations, the difficulty in moving data across processing platforms, and the unavailability of agreed-upon standards for preprocessing have prevented large-scale analyses of EEG. Here we describe a “containerized” approach and freely available tools we have developed to facilitate the process of annotating, packaging, and preprocessing EEG data collections to enable data sharing, archiving, large-scale machine learning/data mining and (meta-)analysis. The EEG Study Schema (ESS) comprises three data “Levels,” each with its own XML-document schema and file/folder convention, plus a standardized (PREP) pipeline to move raw (Data Level 1) data to a basic preprocessed state (Data Level 2) suitable for application of a large class of EEG analysis methods. Researchers can ship a study as a single unit and operate on its data using a standardized interface. ESS does not require a central database and provides all the metadata necessary to execute a wide variety of EEG processing pipelines. The primary focus of ESS is automated in-depth analysis and meta-analysis of EEG studies. However, ESS can also encapsulate meta-information for other modalities, such as eye tracking, that are increasingly used in both laboratory and real-world neuroimaging. ESS schema and tools are freely available at www.eegstudy.org and a central catalog of over 850 GB of existing data in ESS format is available at studycatalog.org. These tools and resources are part of a larger effort to enable data sharing at sufficient scale for researchers to engage in truly large-scale EEG analysis and data mining (BigEEG.org).