Authors: Olli-Pekka Lehto, Ville Savolainen, Raimo Uusvuori, Arto Teräs
Date: 2005-11-14
Status: Draft. To be replaced by Sepeli/M-grid User's Guide. Changelog at the bottom of the page.
General information has been moved to a separate page.
Portland Group (PGI) compiler suite 5.2
C: pgcc, C++: pgCC
Fortran 77: pgf77, Fortran 90/95: pgf90
Optimization flags: -O3 -fastsse
Static linking: -Bstatic
32-bit code: -tp k8-32 -Wa,--32
For more options, e.g., -Mipa=fast, see the man pages, the online documentation, or the PGI documentation locally at /opt/pgi/linux86-64/5.2/doc/index.htm.
GNU compilers
C: gcc, C++: g++
Fortran 77: g77
Optimization flags: -O3 -funroll-all-loops -msse
Static linking: -static
32-bit code: -m32
CSC and some other sites have the PathScale compilers installed locally. Its Fortran 90 compiler is probably somewhat more efficient than Portland's. You may install a PathScale test license locally for evaluation; a 30-day test license is available.
C: pathcc, C++: pathCC
Fortran 90/95: pathf90
Optimization flags: -O3 -OPT:Ofast
Static linking: -static
32-bit code: -m32
For more options, e.g., -ipa, see the man pages or /opt/pathscale/share/doc/pathscale-compilers-2.1/UserGuide.pdf.
Both MPICH (currently version 1.2.6 in M-grid) and LAM/MPI (v. 7.0.6) libraries are installed. Both LAM/MPI and MPICH have had some problems with SGE, sometimes leaving daemons hanging on the nodes after the completion of a batch job. MPICH-2 is also installed and functional, but not tied to SGE. MPICH-1.2 is the MPI library officially supported by CSC. (FIXME: status now?)
Before compiling and/or executing a parallel MPI program, you must initialize the correct environment variables by running one of the init scripts described below. This sets the correct paths for mpicc, mpif77, mpif90, mpirun, etc.
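What such an init script does can be sketched roughly as follows. The file written here is illustrative only and is not the actual contents of /opt/mpich/mpich-pgi64.sh; the real scripts may set more variables.

```shell
# Rough, illustrative sketch of the init-script mechanism (not the real
# contents of the scripts under /opt/mpich/ or /opt/lam/).
cat > my-mpi-env.sh <<'EOF'
MPI_HOME=/opt/mpich/pgi64
export PATH="$MPI_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$MPI_HOME/lib:$LD_LIBRARY_PATH"
EOF
. ./my-mpi-env.sh       # same effect as 'source' in bash
case ":$PATH:" in
  *":/opt/mpich/pgi64/bin:"*) echo "MPI paths set" ;;
esac
```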
MPICH with PGI
Init script: 'source /opt/mpich/mpich-pgi64.sh' (64-bit), 'source /opt/mpich/mpich-pgi32.sh' (32-bit)
Paths: /opt/mpich/pgi64/ and /opt/mpich/pgi32/
MPICH with GNU
Init script: 'source /opt/mpich/mpich-gnu64.sh' (64-bit), 'source /opt/mpich/mpich-gnu32.sh' (32-bit)
Paths: /opt/mpich/gnu64/ and /opt/mpich/gnu32/
LAM/MPI with PGI
Init script: 'source /opt/lam/lam-pgi64.sh' (64-bit), 'source /opt/lam/lam-pgi32.sh' (32-bit)
Paths: /opt/lam/pgi64/ and /opt/lam/pgi32/
LAM/MPI with GNU
Init script: 'source /opt/lam/lam-gnu64.sh' (64-bit), 'source /opt/lam/lam-gnu32.sh' (32-bit)
Paths: /opt/lam/gnu64/ and /opt/lam/gnu32/
The PathScale suite is compatible with the GNU compilers, so you may link PathScale-compiled code against MPI libraries built with GCC. For example:
pathf90 mpitest.f90 -I/opt/mpich/gnu64/include -L/opt/mpich/gnu64/lib/ -lmpich
Alternatively, both MPI libraries are now also configured and compiled for the PathScale 64-bit environment, and you may use similar init and mpi* compilation scripts. In both cases, the corresponding mpirun script may be used.
It is recommended to run parallel programs, even test runs, as batch jobs. Usage of SGE is described below.
MPICH-2 is installed under /opt/mpich2/, and similar compiler-dependent subdirectories and scripts can be found for using it as for MPICH-1.2. However, MPICH-2 needs to start (mpdboot) and stop (mpdallexit) its mpd daemons for each MPI job, and this is not yet tied to the SGE configuration.
ACML (AMD Core Math Library) 2.0 is located in /opt/acml/
Atlas (Automatically Tuned Linear Algebra Software) 3.5-12 is located in /usr/lib/
Both ACML and Atlas provide optimized BLAS and LAPACK libraries.
ACML includes, in addition, FFT routines. The ACML library is
specifically optimized for the AMD Opteron architecture and is
recommended.
Examples of compilation and linking:
pgf77 dgetrf_example.f -O3 -fastsse -Bstatic -Mcache_align -lacml -L/opt/acml/pgi64/lib
g77 dgetrf_example.f -O3 -funroll-all-loops -msse -static -lacml -L/opt/acml/gnu64/lib
ACML is available as both static and shared libraries. The one to be used is chosen either by compiler flags or by linking explicitly.
The ACML documentation is available locally at /opt/acml/Doc and online.
The Scalapack libraries are located in their respective MPI/compiler version directories, e.g., the 64-bit GNU/MPICH version in /opt/scalapack/mpich-gnu64/.
Whether you may do test runs interactively, especially of an MPI program, depends on your local site's usage policy. The preferred method is to use the test queues of the batch system.
If interactive test usage is allowed, you can use mpirun directly, either to execute the code locally on the front node or remotely on the compute nodes. Note that the first option consumes the front node's resources and the latter interferes with the batch system's scheduling. Thus, the following is discouraged, and your local admins reserve the right to kill your job:
First, run the correct init script (see above). In addition, for LAM/MPI you also need to issue the command 'lamboot' if lamd is not running. Then:
MPICH
mpirun -np [number of cpus] -nolocal -machinefile [machinefile] [executable]
LAM/MPI
mpirun -np [number of cpus] [executable]
The machinefile is a list of the nodes used as a pool for your job.
NB. SGE offers a more refined way to run interactive test jobs, described below.
The queue management is handled by Sun Grid Engine (SGE). All batch jobs and larger interactive jobs should be submitted via the queuing mechanism. See the more detailed instructions on how to use SGE on M-grid.
qstat, qstat -f - Queue status
qsub [job script file] - Submit a job into the queue
qdel [job id] - Remove a job from the queue
qrsh - Executes a command interactively on a free slot.
SGE job script samples for both serial and MPI jobs can be found in the directory /home/samples. The following is an example of an SGE script for a 4-processor 64-bit MPICH job, compiled with PGI (f90).
#!/bin/sh
#$ -N identifier-for-my-job
#$ -cwd
#$ -j y
#$ -pe mpich 4
#$ -S /bin/bash
source /opt/mpich/mpich-pgi64.sh
echo "Got $NSLOTS processors."
echo "Machines:"
cat $TMPDIR/machines
mpirun -np $NSLOTS -machinefile $TMPDIR/machines /home/vsavolai/mpitest-exe
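For comparison, the serial case needs no parallel environment line. A minimal serial job script might look like the following sketch (the job name and program name here are illustrative, not taken from the samples directory):

```shell
#!/bin/sh
#$ -N my-serial-job
#$ -cwd
#$ -j y
#$ -S /bin/bash
./my-serial-program
```

Submit it with qsub in the same way as the MPI script.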
Currently the parallel environment 'mpich' (the keyword for -pe, as in the script above) and the MPICH library should be used, for both 64- and 32-bit applications.
If you want to test or debug an MPI program, you should avoid doing it on the front node. In addition, for debugging you naturally want interactive control.
Here's how you can ask for an interactive session on a
compute node via SGE:
qrsh
This gives you a random node with low load. Note that you can also debug a program running with more than 2 processes on a 2-processor node.
Copy the program to be tested or debugged (in the latter case, compiled with the option -g) and all the files needed (input etc.) to a directory of your choice on the node. Also generate a machinefile there (the number 4 in the following specifies that you will be able to run the code with 4 processes):
echo `hostname`":4" > machines
Run the desired init script normally:
source /opt/mpich/mpich-pgi64.sh
Run the code using the machinefile just created (in the following, through the ddd debugger via the option -dbg=ddd):
mpirun -machinefile machines -np 4 -dbg=ddd the_full_path_and_name_of_your_exe &