DataLad makes data management and data distribution more accessible. To do that, it stands on the shoulders of Git and git-annex to deliver a decentralized system for data exchange. This includes automated ingestion of data from online portals and exposing them in readily usable form as Git(-annex) repositories, so-called datasets. The actual data storage and permission management, however, remain with the original data providers.
At the moment, DataLad provides access to over 11 TB of data in a variety of datasets from different resources. It supports efficient search, installation of datasets, modification of data with tracking under Git version control, publication of datasets to HTTP websites, S3, figshare, etc., reproducible computation, and other features.
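
As an illustration of the install, retrieve, and track cycle described above, the following minimal sketch assembles the corresponding `datalad` command lines (`clone`, `get`, `save`) and, by default, only prints them. The dataset URL, directory name, file path, and commit message are hypothetical placeholders, and the helper functions `workflow_commands` and `run` are introduced here for illustration only; they are not part of DataLad.

```python
import shlex
import subprocess

# Hypothetical placeholders, not taken from the text above.
DATASET_URL = "https://example.org/some-dataset.git"
DATASET_DIR = "some-dataset"
FILE_PATH = "some-dataset/data/file.dat"


def workflow_commands(url=DATASET_URL, dataset=DATASET_DIR, path=FILE_PATH):
    """Return the `datalad` command lines for a clone/get/save cycle."""
    return [
        # Obtain the dataset; annexed file content is not downloaded yet.
        ["datalad", "clone", url, dataset],
        # Fetch the content of one file on demand.
        ["datalad", "get", "-d", dataset, path],
        # Record local modifications in the dataset's version history.
        ["datalad", "save", "-d", dataset, "-m", "Update file"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    rendered = []
    for argv in commands:
        line = shlex.join(argv)
        print(line)
        rendered.append(line)
        if not dry_run:
            # Requires a working DataLad installation and network access.
            subprocess.run(argv, check=True)
    return rendered
```

Calling `run(workflow_commands())` prints the three commands without executing anything; passing `dry_run=False` would actually invoke the `datalad` executable, assuming it is installed and the URL points to a real dataset.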

data distribution, data management, data publication, data sharing, distributed version control


