The design of a molecular dynamics trajectory database is presented as an example of the organization of large-scale dynamic distributed repositories for scientific data. Large scientific datasets are usually interpreted through reduced data calculated by analysis functions. This allows a database architecture in which the analyzed datasets, that are kept in addition to the raw datasets, are transferred to a database user. A flexible user interface with a well defined Application Program Interface (API) allows for a wide array of analysis functions and the incorporation of user defined functions is a critical part of the database design. An analysis function is executed only when the requested analysis result is not available from an earlier request. A prototype implementation used to gain initial practical experiences with performance and scalability is presented.
ASJC Scopus subject areas
- Hardware and Architecture
- Computer Networks and Communications