How to download Human Connectome Project data from Amazon Web Services
All public data releases from the Human Connectome Project (HCP) are available online through ConnectomeDB. However, the Aspera Connect browser plugin, which is required to download data from the database, doesn't work on my Ubuntu machine: it crashes whenever I try to start a download.
Fortunately, HCP also shares all of the data through Amazon Web Services (AWS), as part of the AWS Public Data Sets program, which makes the data available directly from the command line using the AWS Command Line Interface (awscli).
To access the data using awscli, you first need to create your AWS credentials through ConnectomeDB (the steps are described here) and then set up awscli accordingly (see the documentation here).
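As a rough sketch of that setup step, the keys can be stored under a separate named profile so that they don't clash with any other AWS account you use. The function name `configure_hcp_profile` and the profile name `hcp` are my own choices for illustration, and the key values are placeholders for the ones ConnectomeDB generates for you:

```shell
#!/usr/bin/env bash
# Store the AWS keys issued through ConnectomeDB under a named awscli profile.
# Arguments: profile name, access key ID, secret access key.
configure_hcp_profile() {
    local profile="$1" access_key="$2" secret_key="$3"
    aws configure set aws_access_key_id "$access_key" --profile "$profile"
    aws configure set aws_secret_access_key "$secret_key" --profile "$profile"
}

# Example call ("hcp" is a hypothetical profile name; use your own keys):
# configure_hcp_profile hcp "<your-access-key-id>" "<your-secret-access-key>"
```

With a named profile in place, every later `aws` command needs `--profile hcp` (or whatever name you picked). Running the interactive `aws configure` instead stores the keys in the default profile, which is simpler if HCP is the only thing you use AWS for.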
If you're not familiar with the HCP directory structure, it is worth reading about it before you try to access and download the data. The structure is described in the reference manuals that accompany every release, which are available on the HCP site under Documentation.
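Once you know which directory you are after, it can help to look at it on S3 before downloading anything. Here is a minimal sketch of how I would build the location for one subject. The bucket name `hcp-openaccess`, the release prefix `HCP_1200`, the subject ID `100307` and the `T1w/Diffusion/` directory are assumptions on my part, so check them against the reference manual for the release you want; `hcp_s3_uri` is just a helper name I made up:

```shell
#!/usr/bin/env bash
# Bucket and release prefix are assumptions; override them via the environment
# if the release you need lives somewhere else.
HCP_BUCKET="${HCP_BUCKET:-hcp-openaccess}"
HCP_RELEASE="${HCP_RELEASE:-HCP_1200}"

# Print the S3 URI of a directory inside one subject's folder.
# Arguments: subject ID, optional path relative to the subject folder.
hcp_s3_uri() {
    local subject="$1" subdir="${2:-}"
    printf 's3://%s/%s/%s/%s\n' "$HCP_BUCKET" "$HCP_RELEASE" "$subject" "$subdir"
}

# List the contents of that directory without downloading it, e.g.:
# aws s3 ls "$(hcp_s3_uri 100307 T1w/Diffusion/)"
```

If you stored your HCP keys under a named awscli profile rather than the default one, add `--profile <name>` to the `aws s3 ls` call.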
I usually create simple Bash scripts to fetch all the data that I need. The following code, for example, downloads the preprocessed diffusion data for 10 unrelated subjects (text file with the list of subjects):