Environment Modules
Overview
Teaching: 35 min
Exercises: 5 min
Topics
How do I load modules to access software that I want to use for my research?
Objectives
Learn what modules are and how to search for, load, and unload them
Environment Variables
The shell, and many other command-line programs, use a set of variables to control their behavior. Those variables are called environment variables. Think of them as placeholders for information stored within the system that is passed to programs launched from the shell.
Environment variables control CLI functionality: they declare where to search for executable commands, where to search for libraries, which language is used to display messages to you, and how your prompt looks. Beyond the shell itself, environment variables are used by many codes to control their own operation.
You can see all the variables currently defined by executing:
$ env
Shell variables can be created like this:
$ A=10
$ B=20
Environment variables are shell variables that have been exported, i.e., converted into global variables. The commands to do this look like:
$ A=10
$ B=20
$ export A
$ export B
Or simply:
$ export A=10
$ export B=20
Environment variables are similar to the shell variables that you create in the shell. Shell variables can store data and be manipulated during the life of the shell session. However, only environment variables are visible to child processes created from that shell. To clarify this, consider this script:
#!/bin/bash
echo A= $A
echo B= $B
C=$(( $A + $B ))
echo C= $C
Now create two shell variables and execute the script; then do the same with environment variables and notice that the script is now able to see the variables.
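A short session makes the difference visible. The sketch below saves the script above as `variables.sh` (an illustrative filename) and runs it twice, once with plain shell variables and once with exported ones:

```shell
# Recreate the script above as variables.sh (illustrative filename)
cat > variables.sh << 'EOF'
#!/bin/bash
echo A= $A
echo B= $B
C=$(( $A + $B ))
echo C= $C
EOF
chmod +x variables.sh

A=10; B=20        # plain shell variables: NOT inherited by the child process
./variables.sh    # A and B print empty; the arithmetic even complains

export A B        # promote them to environment variables
./variables.sh    # now the script prints A= 10, B= 20 and C= 30
```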
Some common environment variables are:
Environment Variable | Description |
---|---|
$USER | Your username |
$HOME | The path to your home directory |
$PS1 | Your prompt |
$PATH | List of locations to search for executable commands |
$MANPATH | List of locations to search for manual pages |
$LD_LIBRARY_PATH | List of locations to search for libraries in runtime |
$LIBRARY_PATH | List of locations to search for libraries during compilation (actually during linking) |
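The list-valued variables above are colon-separated and searched from left to right, so prepending a directory gives its contents precedence. A small sketch with `PATH` (the `$HOME/bin` directory is illustrative):

```shell
# PATH is an ordered, colon-separated list of directories
echo "$PATH"
# command -v reports which executable the shell would actually run
command -v ls
# Prepend a personal bin directory (illustrative) so it is searched first
mkdir -p "$HOME/bin"
export PATH="$HOME/bin:$PATH"
echo "$PATH"      # now starts with $HOME/bin
```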
Those are just a few environment variables in common use. There are many more. Changing them will change where executables are found, which libraries are used, and how the system behaves in general. That is why managing environment variables properly is so important on any machine, and even more so on an HPC cluster, a machine that runs many different codes in different versions.
This is where environment modules come in.
Environment Modules
The modules software package allows you to dynamically modify your user environment by using modulefiles.
Each modulefile contains the information needed to configure the shell for an application. After the modules software package is initialized, the environment can be modified on a per-module basis using the module command, which interprets modulefiles. Typically, modulefiles instruct the module command to alter or set shell environment variables such as PATH, MANPATH, and others. Modulefiles can be shared by many users on a system, and users can have their own collections to supplement or replace the shared modulefiles.
As a user, you can add and remove modulefiles from the current environment. The environment changes contained in a modulefile can also be summarized with the module show command. You are welcome to change modules in your .bashrc or .cshrc, but be aware that some modules print information (to standard error) when loaded; this output should be redirected to a file or to /dev/null when a module is loaded in an initialization script.
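For example, a line in your .bashrc could look like the following sketch; the module name is taken from the Thorny Flat list later in this page, so substitute one that exists on your cluster:

```shell
# In ~/.bashrc: preload a compiler module at login, silencing its messages
module load lang/gcc/8.2.0 2> /dev/null
```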
Basic arguments
The following table lists the most common module subcommands.
Command | Description |
---|---|
module list | Lists modules currently loaded in a user’s environment. |
module avail | Lists all available modules on a system. |
module show | Shows environment changes that will be made by loading a given module. |
module load | Loads a module. |
module unload | Unloads a module. |
module help | Shows help for a module. |
module swap | Swaps a currently loaded module for an unloaded module. |
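A typical session combining these subcommands might look like the sketch below; the module names are examples from the Thorny Flat list, and the output will vary by cluster:

```shell
$ module list                                  # what is loaded right now?
$ module avail lang/gcc                        # which GCC modules exist?
$ module show lang/gcc/8.2.0                   # what would loading it change?
$ module load lang/gcc/8.2.0                   # load it
$ module swap lang/gcc/8.2.0 lang/gcc/9.3.0    # replace one version with another
$ module unload lang/gcc/9.3.0                 # remove it again
```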
Creating a private repository
The basic procedure is to locate modules on a folder accessible by relevant users and add the variable MODULEPATH
to your .bashrc
MODULEPATH
controls the path that the module command searches when looking for
modulefiles.
Typically, it is set to a default value by the bootstrap procedure.
MODULEPATH
can be set using ’module use’ or by the module initialization
script to search group or personal modulefile directories before or after
the master modulefile directory.
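As a sketch, the procedure could look like this; the tool name `mytool`, its version, and the install path are all hypothetical:

```shell
# Create a personal modulefile tree
mkdir -p "$HOME/modulefiles/mytool"

# Write a minimal modulefile for a hypothetical tool
cat > "$HOME/modulefiles/mytool/1.0" << 'EOF'
#%Module1.0
## Hypothetical modulefile: adjust the path to your real installation
prepend-path PATH /opt/mytool/1.0/bin
EOF

# Make the module command search the personal tree
# (add this line to ~/.bashrc to make it permanent)
export MODULEPATH="$HOME/modulefiles:$MODULEPATH"
# Alternatively: module use $HOME/modulefiles
# After this, `module avail mytool` should list mytool/1.0
```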
Exercise: Using modulefiles
Check the modules that you currently have loaded and clean (purge) your environment of them. Check again and confirm that no module is loaded.
Check which versions of Python, R, and GCC you have from the RHEL base system itself. Try to get an idea of how old those three components are. For Python and R, all you have to do is enter the corresponding command (R or python). For GCC you need to use gcc --version and see the date of those programs. Now let's get newer versions of those three components by loading the corresponding modules. Search for the modules for Python 3.7.2, R 3.4.1, and GCC 8.2.0 and load them. To make things easier, you can check the availability of modules just in the languages section:
module avail lang
Check again which versions of those three components you have now. Notice that in the case of Python 3, the python command still points to the old Python 2.6.6; since the Python 3.x interpreter is not backwards compatible with Python 2.x, the new command is called python3. Check its version by entering the command. Finally, clean all of the environment:
module purge
Go back and purge all the modules from your environment. We will now explore why it is important to use a recent compiler. Try to compile the code at
workshops_hands-on/Introduction_HPC/5._Environment_Modules/lambda_c++14.cpp
Go to the folder and execute:

g++ lambda_c++14.cpp
At this point you should have received a list of errors. That is because, even though the code is C++, it uses elements of the language that were not present in the C++ specification that old compilers implement by default. The code actually uses C++14, and only recent versions of GCC accept these declarations. Let's check how many GCC compilers we have available on Thorny Flat.
module avail lang/gcc
Now, from that list, start loading and trying to compile the code as indicated above. Which versions of GCC allow you to compile the code? Also try the Intel compilers. In the case of Intel, the command to compile the code is
icpc lambda_c++14.cpp
Try with all the Intel compilers; it will fail with all of them. That is because the default standard for the Intel C++ compiler is not C++14; you need to declare it explicitly, for example with Intel Compiler suite 17.0.1:
icpc lambda_c++14.cpp -std=c++14
Now it should be clearer why modules are an important feature of any HPC infrastructure: they allow you to use several compilers, libraries, and packages in different versions. On a normal computer, you usually have just one.
Reference: Modules on the clusters
This is the list of all modules on Thorny Flat as of July 2020.
TIER 0 | TIER 1 | TIER 2 |
---|---|---|
benchmarks/hpl/2.3_gcc48 benchmarks/hpl/2.3_gcc82 dev/cmake/3.15.2 dev/cmake/3.15.4 dev/doxygen/1.8.15 lang/gcc/7.5.0 lang/gcc/8.2.0 lang/gcc/8.4.0 lang/gcc/9.3.0 lang/go/1.12.7 lang/intel/2018 lang/intel/2018_u4 lang/intel/2019 lang/intel/2019_u5 lang/java/jdk1.8.0_201 lang/julia/1.1.1 lang/julia/1.2.0 lang/pgi/19.10 lang/pgi/19.4 lang/python/cpython_3.6.9_gcc82 lang/python/cpython_3.7.2_gcc82 lang/python/cpython_3.7.4_gcc82 lang/python/intelpython_2.7.14 lang/python/intelpython_2.7.16 lang/python/intelpython_3.6.3 lang/python/intelpython_3.6.9 lang/python/pypy2.7-7.1.1-portable lang/python/pypy3.6-7.1.1-portable lang/python/pypy3.6-v7.1.1-thorny lang/r/3.5.2 lang/r/3.6.2 libs/atompaw/4.1.0.5_gcc48 libs/atompaw/4.1.0.5_intel18 libs/boost/1.70_gcc48_ompi216 libs/boost/1.70_gcc82_ompi216 libs/boost/1.70_intel18 libs/boost/1.73 libs/cfitsio/3.47_gcc82 libs/eigen/3.3.7 libs/fftw/3.3.8_gcc48 libs/fftw/3.3.8_gcc75 libs/fftw/3.3.8_gcc75_ompi3.1.6 libs/fftw/3.3.8_gcc82 libs/fftw/3.3.8_gcc82b libs/fftw/3.3.8_gcc82_ompi4 libs/fftw/3.3.8_gcc84 libs/fftw/3.3.8_gcc84_ompi3.1.6 libs/fftw/3.3.8_gcc93 libs/fftw/3.3.8_gcc93_ompi3.1.6 libs/fftw/3.3.8_intel18 libs/gmp/6.2.0 libs/hdf5/1.10.5_gcc48 libs/hdf5/1.10.5_gcc48_ompi31 libs/hdf5/1.10.5_gcc82 libs/hdf5/1.10.5_gcc82_ompi31 libs/hdf5/1.10.5_intel18 libs/hdf5/1.10.5_intel18_impi18 libs/hdf5/1.10.5_intel19 libs/hdf5/1.10.5_intel19_impi19 libs/hdf5/1.10.6_gcc82_ompi31 libs/hdf5/1.12.0_gcc75 libs/hdf5/1.12.0_gcc75_ompi31 libs/hdf5/1.12.0_gcc84 libs/hdf5/1.12.0_gcc84_ompi31 libs/hdf5/1.12.0_gcc93 libs/hdf5/1.12.0_gcc93_ompi31 libs/libpsml/1.1.7_gcc82 libs/libxc/3.0.1_gcc48 libs/libxc/3.0.1_gcc82 libs/libxc/3.0.1_intel18 libs/libxc/4.2.3_intel18 libs/libxc/4.3.4_gcc82 libs/libxc/4.3.4_intel18 libs/magma/2.5.1_gcc48 libs/netcdf/4.1.1_gcc48 libs/netcdf/4.7.1_gcc82 libs/netcdf/4.7.1_intel18 libs/netcdf/4.7.1_intel19 libs/netcdf/4.x_gcc48 libs/netcdf/4.x_gcc48_ompi2 libs/netcdf/4.x_gcc82 libs/netcdf/4.x_gcc82_ompi4 libs/netcdf/4.x_intel18 
libs/netcdf/4.x_intel18_impi18 libs/netcdf/fortran-4.5.2_intel18 libs/netlib/3.8.0_gcc82 libs/netlib/3.8.0_intel18 libs/openblas/0.3.5_gcc48 libs/openblas/0.3.5_gcc82 libs/openblas/0.3.7_gcc82 libs/openblas/0.3.9_gcc75 libs/openblas/0.3.9_gcc84 libs/openblas/0.3.9_gcc93 libs/refblas/3.8_gcc82 libs/suitesparse/5.4.0_gcc82 libs/swig/4.0.1_gcc82 libs/xmlf90/1.5.4_gcc48 libs/xmlf90/1.5.4_gcc82 libs/yaml/0.2.2_gcc82 libs/zeromq/4.3.1_gcc82 parallel/cuda/10.0.130 parallel/hwloc/1.10.1_gcc48 parallel/hwloc/1.10.1_gcc82 parallel/hwloc/1.10.1_intel18 parallel/hwloc/1.11.13_gcc82 parallel/hwloc/2.0.3_gcc82 parallel/hwloc/2.0.3_intel18 parallel/impi/2017 parallel/mpich/3.3_gcc82 parallel/mvapich2/2.3.1_gcc82 parallel/openmpi/2.1.2_gcc48 parallel/openmpi/2.1.6_gcc48 parallel/openmpi/2.1.6_gcc82 parallel/openmpi/2.1.6_intel18 parallel/openmpi/3.1.4_gcc48 parallel/openmpi/3.1.4_gcc82 parallel/openmpi/3.1.4_intel18 parallel/openmpi/3.1.6_gcc75 parallel/openmpi/3.1.6_gcc84 parallel/openmpi/3.1.6_gcc93 parallel/ucx/1.5.0_gcc82 utils/tmux/3.0a |
conda matlab/2018b singularity/2.5.2 |
ansys/fluids_19.2 astronomy/casa/5.3.0 astronomy/casa/5.4.1 astronomy/casa/5.6.0 atomistic/abinit/8.10.2_intel18 atomistic/abinit/8.10.3_gcc82 atomistic/abinit/8.10.3_gcc82_mpiio atomistic/abinit/8.10.3_intel18 atomistic/abinit/9.0.4_gcc82 atomistic/amber/18_cuda atomistic/amber/18_mpi atomistic/amber/18_openmp atomistic/elk/5.2.14_intel18 atomistic/espresso/6.4_intel18_seq atomistic/espresso/6.4_intel18_thd atomistic/gaussian/g16 atomistic/gaussian/g16_rev1 atomistic/gromacs/2016.6 atomistic/gromacs/2016.6_cuda atomistic/gromacs/2016.6_gcc48_cuda atomistic/gromacs/2016.6_gcc82 atomistic/gromacs/2016.6_plumed_gcc82 atomistic/gromacs/2018.8_gcc82 atomistic/gromacs/2018.8_plumed_gcc82 atomistic/gromacs/2019.3 atomistic/gromacs/2019.3_gcc48_cuda atomistic/gromacs/2019.4 atomistic/gromacs/2019.4_double atomistic/gromacs/2019.4_gcc82 atomistic/gromacs/2019.4_plumed_gcc82 atomistic/gromacs/5.1.5_cuda atomistic/lammps/2018-12-12_gcc82 atomistic/lammps/2018-12-12_gcc82_ompi2 atomistic/lammps/2019.06.05 atomistic/lammps/2019.08.07_gcc82_ompi31 atomistic/lammps/2019.08.07_intel19_impi19 atomistic/namd/2.13_CPU atomistic/namd/2.13_CUDA atomistic/namd/NAMD_Git-2020-01-02-mpi atomistic/namd/NAMD_Git-2020-01-02-mpi-smp atomistic/namd/NAMD_Git-2020-01-02-ofi atomistic/namd/NAMD_Git-2020-01-02-ofi-smp atomistic/octopus/9.1_gcc82 atomistic/octopus/9.1_gcc82_ompi31 atomistic/orca/4.2.1_ompi216 atomistic/orca/4.2.1_ompi314 atomistic/plumed/2.5.3_gcc82 atomistic/siesta/4.0.2_intel18 atomistic/siesta/4.0.2_intel19 atomistic/vasp/5.4.4_intel18_seq atomistic/vasp/5.4.4_intel18_thd atomistic/vasp/5.4.4_intel19_seq atomistic/vasp/5.4.4_intel19_thd bioinformatics/emboss/6.6.0 bioinformatics/gatk/4.1.0 data/hdfview/3.1.0 math/dakota/6.10 math/dakota/6.10-UI math/dakota/6.8 math/dakota/6.8-UI math/gams/26.1 visual/graphviz/2.40.1_gcc82 visual/paraview/5.6.0 /shared/modulefiles/tier3: general_gcc82 general_intel18 jupyter_kernels r/3.5.2 r/3.6.2 |
Key Points
Use module avail to see all the modules available on the cluster.
Use module load <module_name> to load the module that you need.
You can preload modules at each login by adding the load line to your $HOME/.bashrc.