Encouraging collaboration through a new data management approach
Abstract
The ability to store large volumes of data is
increasing faster than processing power. Some existing data management
methods result in data loss, inaccessible data or repeated
simulations. We propose a framework that promotes collaboration and
simplifies data management.
In particular, we have demonstrated the
proposed framework by handling large-scale data
generated from biomolecular simulations in a multi-institutional global
collaboration. The framework extends the Python
problem-solving environment so that it can manage the data files and
metadata associated with simulations. We provide a transparent and seamless
environment in which user-submitted code can analyse and post-process data
stored in the framework. Based on this scenario, we have further
enhanced and extended the framework to handle the more general case,
in which any existing data file can be post-processed from any
.NET-enabled programming language.