Using HDF5 for Fortran/Python data exchange

When dealing with large datasets, we usually want to do any I/O in binary format, for reasons of accuracy, performance and file size.

Two languages I commonly use are Python and Fortran. Getting these two languages to exchange data can be difficult. The layout of binary (unformatted) output in Fortran depends on the compiler, whilst the np.save function in Python writes a well-defined format (.npy) that is nonetheless a pain to read back in Fortran.

A useful solution to this problem is to use the Hierarchical Data Format, HDF (here, HDF5).

HDF is especially useful as it allows one to save different types of data in a ‘folders and files’ layout (groups and datasets, in HDF terms) within a single file that can then be read in a platform-independent way. See here for a summary of HDF advantages.

Installation

Download the tarball from the HDF Group and unpack it in some directory, which we will refer to as HDF5_Dir. In the commands below, replace HDF5_Dir with the absolute path to that directory.

cd HDF5_Dir
./configure --prefix=HDF5_Dir --enable-fortran
make
make check
make install

For complete details, see the full INSTALL_file.

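If this is all OK, you should be able to navigate to HDF5_Dir/bin and see an executable called h5fc (amongst others). This is a wrapper script around the Fortran compiler (gfortran in my case) that adds the HDF5 include and library flags for you, and it is what we will use when compiling code that uses HDF5.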
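As a rough sketch of how h5fc is invoked, the snippet below compiles a single Fortran source file. The file name write_data.f90 and the H5FC variable are hypothetical and only for illustration, and HDF5_Dir is the same placeholder as above, so substitute your own paths.

```shell
# Path to the h5fc executable; HDF5_Dir is the placeholder install directory used above.
H5FC="${H5FC:-HDF5_Dir/bin/h5fc}"
# Hypothetical source file name, for illustration only.
SRC="write_data.f90"

if [ -x "$H5FC" ] && [ -f "$SRC" ]; then
    # Ordinary compiler options such as -o are passed through to the Fortran compiler.
    "$H5FC" -o write_data "$SRC"
else
    echo "skipping: $H5FC or $SRC not found"
fi
```

The guard simply avoids a confusing error if the install path or the source file is not where the snippet expects it.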

The final step is to add HDF5_Dir/bin to your PATH via your ~/.bashrc file. To do this, simply open ~/.bashrc and add

export PATH="path/to/dir/HDF5_Dir/bin:$PATH"

Save the file and restart the terminal (or run source ~/.bashrc).

Now the command

echo $PATH

should show HDF5_Dir/bin in your PATH, whilst the command

h5fc

should bring up a bunch of compilation options.
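The two checks above can also be scripted. This is a minimal sketch, not part of the HDF5 distribution: the helper name dir_on_path and the HDF5_BIN variable (with its default location) are made up for illustration, so point HDF5_BIN at your own install.

```shell
# Hypothetical install location; replace with the absolute path to your HDF5_Dir/bin.
HDF5_BIN="${HDF5_BIN:-$HOME/HDF5_Dir/bin}"

# Succeed if the given directory is one of the colon-separated PATH entries.
dir_on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if dir_on_path "$HDF5_BIN"; then
    echo "on PATH: $HDF5_BIN"
else
    echo "not on PATH: $HDF5_BIN"
fi

# command -v prints where the shell finds h5fc, if it finds it at all.
if command -v h5fc >/dev/null 2>&1; then
    echo "h5fc found at: $(command -v h5fc)"
    # -show prints the underlying compiler command without running it.
    h5fc -show
else
    echo "h5fc not found"
fi
```

Wrapping PATH in colons before matching means a directory is only reported when it is a complete entry, not merely a prefix of a longer one.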

For example usage in Fortran + Python, see my GitHub repository.