Data and Compute Resources
Ideas
An important question for open science is how best to host data. Do we rely on centralized, curated databases? Do we provide common interfaces and rely on groups to host their own materials? How do we make sure our pipelines can be run by anyone who wishes to reproduce our results?
These questions are closely related to those of complete ecosystems, where data hosting and compute are integrated with software systems. A different approach is to rely on separate services for these needs. As always, the solution will likely be a mixture of these approaches. This page lists data hosting and compute resources which fall into the latter category.
- "geometry of needs and challenges in publishing data" twitter
Resources
Distributed systems
Centralized
- figshare
- Centralized open access (OA) for data and manuscripts (with or without peer review)
- data dryad
- Amazon EC2 and S3: amazon
- Cloud hosting for compute or data access
- Dataverse
- XSEDE
- HPC resources for scientists (apply for compute time)