Environment Modules

Overview

Teaching: 35 min
Exercises: 5 min
Topics
  • How do I load modules to access the software that I want to use for my research?

Objectives
  • Learn about modules: how to search for, load, and unload them

Environment Variables

The shell, like many other command line programs, uses a set of variables to control its behavior. Those variables are called environment variables. Think of them as placeholders for information stored within the system that pass data to programs launched in the shell.

Environment variables control how the command line works. They declare where to search for executable commands, where to search for libraries, which language is used to display messages to you, and how your prompt looks. Beyond the shell itself, environment variables are used by many codes to control their own operation.

You can see all the variables currently defined by executing:

$ env
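
You can also inspect a single variable with echo, or filter the output of env with grep; for example:

$ echo $HOME
$ env | grep USER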

Shell variables can be created like

$ A=10
$ B=20

Environment variables are shell variables that have been exported, i.e., converted into global variables. The commands to do this look like:

$ A=10
$ B=20
$ export A
$ export B

Or simply:

$ export A=10
$ export B=20
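
You can confirm that a variable has been exported by asking the environment for it; printenv prints nothing for a plain (non-exported) shell variable:

$ printenv A
10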

Environment variables are similar to the shell variables that you can create in the shell. Shell variables can be used to store data and manipulate it during the life of the shell session. However, only environment variables are visible to child processes created from that shell. To clarify this, consider this script:

#!/bin/bash

# Print the variables A and B as this (child) process sees them
echo A= $A
echo B= $B
# Shell arithmetic; variables that are unset here evaluate to 0
C=$(( A + B ))
echo C= $C

Now create the two variables as plain shell variables and execute the script; then export them as environment variables and notice that the script is now able to see the variables.
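
For example, assuming the script above is saved as variables.sh (a name chosen here only for illustration) and made executable, the two runs could look like this:

$ chmod +x variables.sh
$ A=10
$ B=20
$ ./variables.sh          # plain shell variables: the child script sees nothing
A=
B=
C= 0
$ export A B
$ ./variables.sh          # environment variables: the child script sees them
A= 10
B= 20
C= 30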

Some common environment variables are:

Environment Variable    Description
$USER                   Your username
$HOME                   The path to your home directory
$PS1                    Your prompt
$PATH                   List of locations to search for executable commands
$MANPATH                List of locations to search for manual pages
$LD_LIBRARY_PATH        List of locations to search for libraries at runtime
$LIBRARY_PATH           List of locations to search for libraries during compilation (actually during linking)
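
You can print any of these variables with echo; splitting PATH on its colon separators makes it easier to read:

$ echo $USER
$ echo $PATH | tr ':' '\n'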

Those are just a few environment variables in common use; there are many more. Changing them will change where executables are found, which libraries are used, and how the system behaves in general. That is why managing environment variables properly is so important on any machine, and even more so on an HPC cluster, a machine that runs many different codes in different versions.

This is where environment modules come in.

Environment Modules

The modules software package allows you to dynamically modify your user environment by using modulefiles.

Each modulefile contains the information needed to configure the shell for an application. After the modules software package is initialized, the environment can be modified on a per-module basis using the module command, which interprets modulefiles. Typically, modulefiles instruct the module command to alter or set shell environment variables such as PATH, MANPATH, and others. The modulefiles can be shared by many users on a system, and users can have their own collection to supplement or replace the shared modulefiles.

As a user, you can add and remove modulefiles from the current environment. The environment changes contained in a modulefile can also be summarized with the module show command. You are welcome to change modules in your .bashrc or .cshrc, but be aware that some modules print information (to standard error) when loaded; that output should be redirected to a file or to /dev/null when the module is loaded from an initialization script.
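
As a sketch, loading a module from your .bashrc while discarding its messages could look like this (the module name is just one taken from the list at the end of this lesson):

# In $HOME/.bashrc
module load lang/gcc/8.2.0 2> /dev/null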

Basic arguments

The following table lists the most common module subcommands.

Command          Description
module list      Lists modules currently loaded in a user's environment.
module avail     Lists all available modules on a system.
module show      Shows environment changes that will be made by loading a given module.
module load      Loads a module.
module unload    Unloads a module.
module help      Shows help for a module.
module swap      Swaps a currently loaded module for an unloaded module.
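
A typical session combining these commands might look like the following; the module names are taken from the reference list at the end of this lesson and may differ on your system:

$ module list                                  # what is loaded right now?
$ module avail lang/python                     # which Python modules exist?
$ module show lang/python/cpython_3.7.2_gcc82  # what would loading it change?
$ module load lang/python/cpython_3.7.2_gcc82 lang/gcc/8.2.0
$ module swap lang/gcc/8.2.0 lang/gcc/9.3.0    # replace a loaded module with another version
$ module unload lang/python/cpython_3.7.2_gcc82
$ module help lang/gcc/9.3.0                   # show the help text of a module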

Creating a private repository

The basic procedure is to place your modulefiles in a folder accessible to the relevant users and add that folder to the MODULEPATH variable in your .bashrc.

MODULEPATH controls the path that the module command searches when looking for modulefiles. Typically, it is set to a default value by the bootstrap procedure. MODULEPATH can be set using 'module use' or by the module initialization script to search group or personal modulefile directories before or after the master modulefile directory.
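
For example, assuming your personal modulefiles live in $HOME/modulefiles (a directory name chosen here only for illustration), either of these lines in your .bashrc would make them visible to the module command:

# Option 1: prepend the directory to MODULEPATH yourself
export MODULEPATH=$HOME/modulefiles:$MODULEPATH
# Option 2: let the module command manage the path
module use $HOME/modulefiles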

Exercise: Using modulefiles

  1. Check the modules that you currently have loaded and clean (purge) your environment of them. Check again and confirm that no module is loaded.

  2. Check which versions of Python, R, and GCC you have from the RHEL system itself. Try to get an idea of how old those three components are. For Python and R, all that you have to do is enter the corresponding command (R or python). For GCC you need to use gcc --version and see the date of the program.

  3. Now let's get newer versions of those three components by loading the corresponding modules. Search for the modules for Python 3.7.2, R 3.4.1, and GCC 8.2.0 and load them. To make things easier, you can check the availability of modules just in the languages section.

    module avail lang
    
  4. Check again which versions of those three components you have now. Notice that in the case of Python 3, the command python still points to the old Python 2.6.6; because the Python 3.x interpreter is not backwards compatible with Python 2.x, the new command is called python3. Check its version by entering that command.

  5. Clean all modules from your environment

    module purge
    
  6. Go back and purge all the modules from your environment. We will now explore why it is important to use a recent compiler. Try to compile the code at workshops_hands-on/Introduction_HPC/5._Environment_Modules/lambda_c++14.cpp. Go to the folder and execute:

    g++ lambda_c++14.cpp
    

    At this point you should have received a list of errors. That is because, even though the code is valid C++, it uses elements of the language that were not part of the C++ specification supported by the system's default compiler. The code actually uses C++14, and only recent versions of GCC allow these declarations. Let's check how many GCC compilers we have available on Thorny Flat.

    module avail lang/gcc
    

    Now, from that list, start loading GCC modules and trying to compile the code as indicated above (a sketch of this load-and-compile loop appears after this exercise). Which versions of GCC allow you to compile the code? Also try the Intel compilers. In the case of Intel, the command to compile the code is:

    icpc lambda_c++14.cpp
    

    Try with all the Intel compilers; it will fail with all of them. That is because the default standard for the Intel C++ compiler is not C++14. You need to declare it explicitly, and that only works for Intel Compiler suite 17.0.1 and later:

    icpc lambda_c++14.cpp -std=c++14
    

    Now it should be clearer why environment modules are an important feature of any HPC infrastructure: they allow you to use several compilers, libraries, and packages in different versions. On a normal computer, you usually have just one.
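
The load-and-compile test in the exercise can be run as a short sequence of commands; here is a minimal sketch, assuming the GCC and Intel module names from the reference list below (the exact set on your cluster may differ):

$ module purge
$ module load lang/gcc/9.3.0
$ g++ lambda_c++14.cpp && echo "OK with $(g++ --version | head -1)"
$ module purge
$ module load lang/intel/2019
$ icpc lambda_c++14.cpp -std=c++14 && echo "OK with Intel 2019"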

Reference: Modules on the clusters

This is the list of all modules on Thorny Flat as of July 2020.

Tier 0, Tier 1, and Tier 2 modules (the Tier 3 modulefile directory is listed separately at the end):
benchmarks/hpl/2.3_gcc48
benchmarks/hpl/2.3_gcc82
dev/cmake/3.15.2
dev/cmake/3.15.4
dev/doxygen/1.8.15
lang/gcc/7.5.0
lang/gcc/8.2.0
lang/gcc/8.4.0
lang/gcc/9.3.0
lang/go/1.12.7
lang/intel/2018
lang/intel/2018_u4
lang/intel/2019
lang/intel/2019_u5
lang/java/jdk1.8.0_201
lang/julia/1.1.1
lang/julia/1.2.0
lang/pgi/19.10
lang/pgi/19.4
lang/python/cpython_3.6.9_gcc82
lang/python/cpython_3.7.2_gcc82
lang/python/cpython_3.7.4_gcc82
lang/python/intelpython_2.7.14
lang/python/intelpython_2.7.16
lang/python/intelpython_3.6.3
lang/python/intelpython_3.6.9
lang/python/pypy2.7-7.1.1-portable
lang/python/pypy3.6-7.1.1-portable
lang/python/pypy3.6-v7.1.1-thorny
lang/r/3.5.2
lang/r/3.6.2
libs/atompaw/4.1.0.5_gcc48
libs/atompaw/4.1.0.5_intel18
libs/boost/1.70_gcc48_ompi216
libs/boost/1.70_gcc82_ompi216
libs/boost/1.70_intel18
libs/boost/1.73
libs/cfitsio/3.47_gcc82
libs/eigen/3.3.7
libs/fftw/3.3.8_gcc48
libs/fftw/3.3.8_gcc75
libs/fftw/3.3.8_gcc75_ompi3.1.6
libs/fftw/3.3.8_gcc82
libs/fftw/3.3.8_gcc82b
libs/fftw/3.3.8_gcc82_ompi4
libs/fftw/3.3.8_gcc84
libs/fftw/3.3.8_gcc84_ompi3.1.6
libs/fftw/3.3.8_gcc93
libs/fftw/3.3.8_gcc93_ompi3.1.6
libs/fftw/3.3.8_intel18
libs/gmp/6.2.0
libs/hdf5/1.10.5_gcc48
libs/hdf5/1.10.5_gcc48_ompi31
libs/hdf5/1.10.5_gcc82
libs/hdf5/1.10.5_gcc82_ompi31
libs/hdf5/1.10.5_intel18
libs/hdf5/1.10.5_intel18_impi18
libs/hdf5/1.10.5_intel19
libs/hdf5/1.10.5_intel19_impi19
libs/hdf5/1.10.6_gcc82_ompi31
libs/hdf5/1.12.0_gcc75
libs/hdf5/1.12.0_gcc75_ompi31
libs/hdf5/1.12.0_gcc84
libs/hdf5/1.12.0_gcc84_ompi31
libs/hdf5/1.12.0_gcc93
libs/hdf5/1.12.0_gcc93_ompi31
libs/libpsml/1.1.7_gcc82
libs/libxc/3.0.1_gcc48
libs/libxc/3.0.1_gcc82
libs/libxc/3.0.1_intel18
libs/libxc/4.2.3_intel18
libs/libxc/4.3.4_gcc82
libs/libxc/4.3.4_intel18
libs/magma/2.5.1_gcc48
libs/netcdf/4.1.1_gcc48
libs/netcdf/4.7.1_gcc82
libs/netcdf/4.7.1_intel18
libs/netcdf/4.7.1_intel19
libs/netcdf/4.x_gcc48
libs/netcdf/4.x_gcc48_ompi2
libs/netcdf/4.x_gcc82
libs/netcdf/4.x_gcc82_ompi4
libs/netcdf/4.x_intel18
libs/netcdf/4.x_intel18_impi18
libs/netcdf/fortran-4.5.2_intel18
libs/netlib/3.8.0_gcc82
libs/netlib/3.8.0_intel18
libs/openblas/0.3.5_gcc48
libs/openblas/0.3.5_gcc82
libs/openblas/0.3.7_gcc82
libs/openblas/0.3.9_gcc75
libs/openblas/0.3.9_gcc84
libs/openblas/0.3.9_gcc93
libs/refblas/3.8_gcc82
libs/suitesparse/5.4.0_gcc82
libs/swig/4.0.1_gcc82
libs/xmlf90/1.5.4_gcc48
libs/xmlf90/1.5.4_gcc82
libs/yaml/0.2.2_gcc82
libs/zeromq/4.3.1_gcc82
parallel/cuda/10.0.130
parallel/hwloc/1.10.1_gcc48
parallel/hwloc/1.10.1_gcc82
parallel/hwloc/1.10.1_intel18
parallel/hwloc/1.11.13_gcc82
parallel/hwloc/2.0.3_gcc82
parallel/hwloc/2.0.3_intel18
parallel/impi/2017
parallel/mpich/3.3_gcc82
parallel/mvapich2/2.3.1_gcc82
parallel/openmpi/2.1.2_gcc48
parallel/openmpi/2.1.6_gcc48
parallel/openmpi/2.1.6_gcc82
parallel/openmpi/2.1.6_intel18
parallel/openmpi/3.1.4_gcc48
parallel/openmpi/3.1.4_gcc82
parallel/openmpi/3.1.4_intel18
parallel/openmpi/3.1.6_gcc75
parallel/openmpi/3.1.6_gcc84
parallel/openmpi/3.1.6_gcc93
parallel/ucx/1.5.0_gcc82
utils/tmux/3.0a
conda
matlab/2018b
singularity/2.5.2
ansys/fluids_19.2
astronomy/casa/5.3.0
astronomy/casa/5.4.1
astronomy/casa/5.6.0
atomistic/abinit/8.10.2_intel18
atomistic/abinit/8.10.3_gcc82
atomistic/abinit/8.10.3_gcc82_mpiio
atomistic/abinit/8.10.3_intel18
atomistic/abinit/9.0.4_gcc82
atomistic/amber/18_cuda
atomistic/amber/18_mpi
atomistic/amber/18_openmp
atomistic/elk/5.2.14_intel18
atomistic/espresso/6.4_intel18_seq
atomistic/espresso/6.4_intel18_thd
atomistic/gaussian/g16
atomistic/gaussian/g16_rev1
atomistic/gromacs/2016.6
atomistic/gromacs/2016.6_cuda
atomistic/gromacs/2016.6_gcc48_cuda
atomistic/gromacs/2016.6_gcc82
atomistic/gromacs/2016.6_plumed_gcc82
atomistic/gromacs/2018.8_gcc82
atomistic/gromacs/2018.8_plumed_gcc82
atomistic/gromacs/2019.3
atomistic/gromacs/2019.3_gcc48_cuda
atomistic/gromacs/2019.4
atomistic/gromacs/2019.4_double
atomistic/gromacs/2019.4_gcc82
atomistic/gromacs/2019.4_plumed_gcc82
atomistic/gromacs/5.1.5_cuda
atomistic/lammps/2018-12-12_gcc82
atomistic/lammps/2018-12-12_gcc82_ompi2
atomistic/lammps/2019.06.05
atomistic/lammps/2019.08.07_gcc82_ompi31
atomistic/lammps/2019.08.07_intel19_impi19
atomistic/namd/2.13_CPU
atomistic/namd/2.13_CUDA
atomistic/namd/NAMD_Git-2020-01-02-mpi
atomistic/namd/NAMD_Git-2020-01-02-mpi-smp
atomistic/namd/NAMD_Git-2020-01-02-ofi
atomistic/namd/NAMD_Git-2020-01-02-ofi-smp
atomistic/octopus/9.1_gcc82
atomistic/octopus/9.1_gcc82_ompi31
atomistic/orca/4.2.1_ompi216
atomistic/orca/4.2.1_ompi314
atomistic/plumed/2.5.3_gcc82
atomistic/siesta/4.0.2_intel18
atomistic/siesta/4.0.2_intel19
atomistic/vasp/5.4.4_intel18_seq
atomistic/vasp/5.4.4_intel18_thd
atomistic/vasp/5.4.4_intel19_seq
atomistic/vasp/5.4.4_intel19_thd
bioinformatics/emboss/6.6.0
bioinformatics/gatk/4.1.0
data/hdfview/3.1.0
math/dakota/6.10
math/dakota/6.10-UI
math/dakota/6.8
math/dakota/6.8-UI
math/gams/26.1
visual/graphviz/2.40.1_gcc82
visual/paraview/5.6.0
/shared/modulefiles/tier3:
general_gcc82
general_intel18
jupyter_kernels
r/3.5.2
r/3.6.2
     

Key Points

  • Use module avail to list all the modules available on the cluster.

  • Use module load <module_name> to load the module that you need.

  • You can preload modules at every login by adding the corresponding module load lines to your $HOME/.bashrc