Install notes for NASA/Pleiades: Intel stack with MPI-SGI

Here we build using the recommended MPI-SGI environment with Intel compilers. A fresh Pleiades environment is fairly bare-bones: no modules are loaded, and your shell is likely a csh variant. To switch shells, send an e-mail to support@nas.nasa.gov; I’ll be using bash.
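
To check which shell you are currently running:

echo $SHELL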

Then add the following to your .profile:

# Add your commands here to extend your PATH, etc.

module load mpi-sgi/mpt
module load comp-intel
module load git
module load openssl
module load emacs

export BUILD_HOME=$HOME/build

export PATH=$BUILD_HOME/bin:$BUILD_HOME:$PATH  # Add private commands to PATH

export LD_LIBRARY_PATH=$BUILD_HOME/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/nasa/openssl/1.0.1h/lib/:$LD_LIBRARY_PATH

# Proper wrappers for using Intel rather than GNU compilers;
# thanks to Daniel Kokron at NASA.
export MPICC_CC=icc
export MPICXX_CXX=icpc

export CC=mpicc

#pathing for Dedalus
export LOCAL_PYTHON_VERSION=3.5.0
export LOCAL_NUMPY_VERSION=1.10.1
export LOCAL_SCIPY_VERSION=0.16.1
export LOCAL_HDF5_VERSION=1.8.15-patch1
export LOCAL_MERCURIAL_VERSION=3.6

export PYTHONPATH=$BUILD_HOME/dedalus:$PYTHONPATH
export MPI_PATH=$MPI_ROOT
export FFTW_PATH=$BUILD_HOME
export HDF5_DIR=$BUILD_HOME

# Pleiades workaround for QP errors 8/25/14 from NAS (only for MPI-SGI)
export MPI_USE_UD=true
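
After editing, source the file (or log out and back in) so the module loads and environment variables take effect:

source ~/.profile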

Python stack

Here we use the recommended MPI-SGI compiler wrappers rather than building our own OpenMPI.

Building Python3

Create $BUILD_HOME and then proceed with downloading and installing Python 3.5 (the $LOCAL_PYTHON_VERSION set above).
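
If $BUILD_HOME does not exist yet, create it first (a one-line sketch using the variable set in .profile above):

mkdir -p $BUILD_HOME

Then download and build: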

cd $BUILD_HOME
wget https://www.python.org/ftp/python/$LOCAL_PYTHON_VERSION/Python-$LOCAL_PYTHON_VERSION.tgz --no-check-certificate
tar xzf Python-$LOCAL_PYTHON_VERSION.tgz
cd Python-$LOCAL_PYTHON_VERSION
./configure --prefix=$BUILD_HOME \
                     OPT="-w -vec-report0 -opt-report0" \
                     FOPT="-w -vec-report0 -opt-report0" \
                     CFLAGS="-mkl -O3 -ipo -axCORE-AVX2 -xSSE4.2 -fPIC" \
                     CPPFLAGS="-mkl -O3 -ipo -axCORE-AVX2 -xSSE4.2 -fPIC" \
                     F90FLAGS="-mkl -O3 -ipo -axCORE-AVX2 -xSSE4.2 -fPIC" \
                     CC=mpicc  CXX=mpicxx F90=mpif90  \
                     --with-cxx-main=mpicxx --with-gcc=mpicc \
                     LDFLAGS="-lpthread" \
                    --enable-shared --with-system-ffi

make
make install
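
As a quick sanity check, confirm that the new interpreter in $BUILD_HOME/bin is the one found on your PATH:

which python3
python3 --version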

The previous intel patch is no longer required.

Installing pip

Python 3.5 automatically includes pip.

On Pleiades, you’ll need to edit ~/.pip/pip.conf:

[global]
cert = /etc/ssl/certs/ca-bundle.trust.crt

You will now have pip3 installed in $BUILD_HOME/bin. You might try pip3 -V to confirm that pip3 is built against Python 3.5. We will use pip3 throughout this documentation to remain compatible with systems (e.g., Mac OS) where multiple versions of python coexist.

We suggest doing the following immediately to suppress version warning messages:

pip3 install --upgrade pip

Installing mpi4py

This should be pip installed:

pip3 install mpi4py

Versions >= 2.0.0 seem to play well with MPI-SGI.
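
A minimal sketch to confirm that mpi4py runs under MPT (use an interactive PBS session or batch job; the rank count here is arbitrary):

mpiexec -np 4 python3 -c "from mpi4py import MPI; comm = MPI.COMM_WORLD; print(comm.rank, 'of', comm.size)"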

Installing FFTW3

We build our own FFTW3:

 wget http://www.fftw.org/fftw-3.3.4.tar.gz
 tar -xzf fftw-3.3.4.tar.gz
 cd fftw-3.3.4

./configure --prefix=$BUILD_HOME \
                      CC=icc        CFLAGS="-O3 -axCORE-AVX2 -xSSE4.2" \
                      CXX=icpc CPPFLAGS="-O3 -axCORE-AVX2 -xSSE4.2" \
                      F77=ifort  F90FLAGS="-O3 -axCORE-AVX2 -xSSE4.2" \
                      MPICC=icc MPICXX=icpc \
                      LDFLAGS="-lmpi" \
                      --enable-shared \
                      --enable-mpi --enable-openmp --enable-threads
 make -j
 make install

With MPI-SGI we pass CC=icc and LDFLAGS="-lmpi" explicitly; it is critical that the MPI libraries end up linked into libfftw3_mpi.so, otherwise Dedalus fails on FFTW import.
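
A quick way to check that linkage (a sketch; the exact library names depend on the MPT version):

ldd $BUILD_HOME/lib/libfftw3_mpi.so | grep -i mpi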

Installing nose

Nose is useful for unit testing, especially in checking our numpy build:

pip3 install nose

Installing cython

This should just be pip installed:

pip3 install cython

Numpy and BLAS libraries

Numpy will be built against a specific BLAS library. On Pleiades we will build against the Intel MKL libraries.

The Intel patches below are only needed for this Intel/MKL stack; they are unnecessary in a gcc/OpenBLAS stack.

Building numpy against MKL

Now, acquire numpy (1.10.1):

cd $BUILD_HOME
wget http://sourceforge.net/projects/numpy/files/NumPy/$LOCAL_NUMPY_VERSION/numpy-$LOCAL_NUMPY_VERSION.tar.gz
tar -xvf numpy-$LOCAL_NUMPY_VERSION.tar.gz
cd numpy-$LOCAL_NUMPY_VERSION
wget http://dedalus-project.readthedocs.org/en/latest/_downloads/numpy_pleiades_intel_patch.tar
tar xvf numpy_pleiades_intel_patch.tar

This last step saves you from hand-editing two files in numpy/distutils: intelccompiler.py and fcompiler/intel.py. I’ve built a crude patch, numpy_pleiades_intel_patch.tar, which the instructions above deploy within the numpy-$LOCAL_NUMPY_VERSION directory. It unpacks and overwrites:

numpy/distutils/intelccompiler.py
numpy/distutils/fcompiler/intel.py

This differs from prior versions in that "-xhost" is replaced with "-axCORE-AVX2 -xSSE4.2". This could be handled more gracefully using an extra_compile_flag option in site.cfg.

We’ll now need to make sure that numpy is building against the MKL libraries. Start by making a site.cfg file:

cp site.cfg.example site.cfg
emacs -nw site.cfg

Edit site.cfg in the [mkl] section; modify the library directories so that they correctly point to $MKLROOT/lib/intel64/. With the modules loaded above, this looks like:

[mkl]
library_dirs = /nasa/intel/Compiler/2015.3.187/composer_xe_2015.3.187/mkl/lib/intel64/
include_dirs = /nasa/intel/Compiler/2015.3.187/composer_xe_2015.3.187/mkl/include
mkl_libs = mkl_rt
lapack_libs =

These are based on Intel’s instructions for compiling numpy with ifort, and they seem to work so far.

Then proceed with:

python3 setup.py config --compiler=intelem build_clib --compiler=intelem build_ext --compiler=intelem install

This will config, build and install numpy.
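
To confirm that numpy picked up MKL rather than a fallback BLAS, one quick check is:

python3 -c "import numpy; numpy.show_config()"

If the MKL build worked, the output should list mkl_rt and the library_dirs from site.cfg.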

Test numpy install

Test that things worked with this executable script numpy_test_full. You can do this full-auto by doing:

wget http://dedalus-project.readthedocs.org/en/latest/_downloads/numpy_test_full
chmod +x numpy_test_full
./numpy_test_full

We successfully link against fast BLAS and the test results look normal.

Python library stack

After numpy has been built we will proceed with the rest of our python stack.

Installing Scipy

Scipy is easier because it gets its config from numpy. Doing a pip install fails, so we’ll keep doing it the old-fashioned way:

wget http://sourceforge.net/projects/scipy/files/scipy/$LOCAL_SCIPY_VERSION/scipy-$LOCAL_SCIPY_VERSION.tar.gz
tar -xvf scipy-$LOCAL_SCIPY_VERSION.tar.gz
cd scipy-$LOCAL_SCIPY_VERSION
python3 setup.py config --compiler=intelem --fcompiler=intelem build_clib \
                                        --compiler=intelem --fcompiler=intelem build_ext \
                                        --compiler=intelem --fcompiler=intelem install
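
A quick import check (a sketch; the full scipy test suite takes much longer):

python3 -c "import scipy; print(scipy.__version__)"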

Note

We do not have umfpack; we should address this moving forward, but for now I will defer that to a later day.

Installing matplotlib

This should just be pip installed. In matplotlib versions > 1.3.1, Qhull has a compile error if the C compiler is used rather than C++, so we force the compiler to be icpc:

export CC=icpc
pip3 install matplotlib
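
Since the front-end and compute nodes are typically headless, a minimal sketch to confirm plotting works with a non-interactive backend (the output file name is arbitrary):

python3 -c "import matplotlib; matplotlib.use('Agg'); import matplotlib.pyplot as plt; plt.plot([0, 1], [0, 1]); plt.savefig('mpl_test.png')"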

Installing HDF5 with parallel support

The new analysis package brings HDF5 file-writing capability. This needs to be compiled with support for parallel (MPI) I/O. The Intel compilers fail on this build under MPI-SGI, so on NASA’s recommendation we fall back to gcc for this library:

export MPICC_CC=
export MPICXX_CXX=
wget http://www.hdfgroup.org/ftp/HDF5/releases/hdf5-$LOCAL_HDF5_VERSION/src/hdf5-$LOCAL_HDF5_VERSION.tar.gz
tar xzvf hdf5-$LOCAL_HDF5_VERSION.tar.gz
cd hdf5-$LOCAL_HDF5_VERSION
./configure --prefix=$BUILD_HOME CC=mpicc CXX=mpicxx F77=mpif90 \
                    --enable-shared --enable-parallel
make
make install
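
To confirm that parallel I/O was actually enabled, one option (assuming the standard libhdf5.settings file was installed) is:

grep -i "parallel hdf5" $BUILD_HOME/lib/libhdf5.settings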

H5PY via pip

This can now just be pip installed (>=2.6.0):

pip3 install h5py
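
A quick check that h5py imports and reports which HDF5 it was built against (a sketch):

python3 -c "import h5py; print(h5py.version.hdf5_version)"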

For now we drop our former instructions on attempting to install parallel h5py with collectives. See the repo history for those notes.

Installing Mercurial

On NASA Pleiades, we need to install mercurial itself. I can’t get mercurial to build properly with the Intel compilers, so for now we use gcc:

cd $BUILD_HOME
wget http://mercurial.selenic.com/release/mercurial-$LOCAL_MERCURIAL_VERSION.tar.gz
tar xvf mercurial-$LOCAL_MERCURIAL_VERSION.tar.gz
cd mercurial-$LOCAL_MERCURIAL_VERSION
module load gcc
export CC=gcc
make install PREFIX=$BUILD_HOME
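
Confirm that the $BUILD_HOME copy is the one being picked up:

which hg
hg --version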

I suggest you add the following to your ~/.hgrc:

[ui]
username = <your bitbucket username/e-mail address here>
editor = emacs

[web]
cacerts = /etc/ssl/certs/ca-bundle.crt

[extensions]
graphlog =
color =
convert =
mq =

Dedalus

Preliminaries

With the python stack in place, do the following:

cd $BUILD_HOME
hg clone https://bitbucket.org/dedalus-project/dedalus
cd dedalus
pip3 install -r requirements.txt
python3 setup.py build_ext --inplace
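
A minimal import check (a sketch; this assumes the package exposes dedalus.public, as in the project examples):

python3 -c "from dedalus import public"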

Running Dedalus on Pleiades

Our scratch disk system on Pleiades is /nobackup/user-name. On this and other systems, I suggest soft-linking your scratch directory to a local working directory in home; I uniformly call mine workdir:

ln -s /nobackup/bpbrown workdir

Long-term mass storage is on LOU.
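
For job submission, a hypothetical minimal PBS script sketch (the queue, walltime, node model, core counts, and script name are placeholders; check the current NAS documentation for valid values):

#PBS -S /bin/bash
#PBS -q normal
#PBS -l select=2:ncpus=24:mpiprocs=24:model=has
#PBS -l walltime=02:00:00

source ~/.profile                     # load modules and environment as above
cd $PBS_O_WORKDIR
mpiexec -np 48 python3 my_script.py   # my_script.py is a placeholder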