On all systems the parallel input preparation is done automatically. Details for the parallel installation are given in Section 3.2.1. The following keywords are necessary for all parallel runs:
$parallel_platform   architecture
$numprocs            number of CPUs
Currently the following parallel platforms are supported:
SMP      Examples are: HP V-Class, SP3-SMP and HP S/X-Class.
MPP      Examples are: SP3 and linuxcluster.
cluster  Examples are: HP Cluster and every platform that is not known
         by TURBOMOLE.
SGI      Similar to SMP, but here the server task is treated differently:
         the MPI implementation on the SGIs would cause this task to
         request too much CPU time otherwise.
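Taken together, a minimal control file fragment for a parallel run might look as follows (the platform name and the number of CPUs are only illustrative and have to be adapted to the machine actually used):

$parallel_platform MPP
$numprocs 8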
$numprocs is the number of slaves, i.e. the number of nodes doing the parallel work. If you want to run mpgrad, $traloop has to be equal to or a multiple of $numprocs.
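For a hypothetical mpgrad run with four slaves this could, for instance, look like the following sketch (values illustrative; note that $traloop is a multiple of $numprocs):

$numprocs 4
$traloop 8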
For very large parallel runs it may be impossible to allocate the scratch files in the working directory. In this case the $scratch files option can be specified; an example for a dscf run is given below. The scratch directory must be accessible from all nodes.
$scratch files
   dscf  dens        /home/dfs/cd00/cd03_dens
   dscf  fock        /home/dfs/cd00/cd03_fock
   dscf  dfock       /home/dfs/cd00/cd03_dfock
   dscf  ddens       /home/dfs/cd00/cd03_ddens
   dscf  xsv         /home/dfs/cd00/cd03_xsv
   dscf  pulay       /home/dfs/cd00/cd03_pulay
   dscf  statistics  /home/dfs/cd00/cd03_statistics
   dscf  errvec      /home/dfs/cd00/cd03_errvec
   dscf  oldfock     /home/dfs/cd00/cd03_oldfock
   dscf  oneint      /home/dfs/cd00/cd03_oneint
For all programs employing density functional theory (DFT) (i.e. dscf/grad and ridft/rdgrad) $pardft can be specified:

$pardft tasksize=1000 memdiv=0
The tasksize
is the approximate
number of points in one DFT task (default: 1000) and memdiv
says whether the nodes are dedicated exclusively to your job (memdiv=1)
or not (default: memdiv=0).
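If the nodes are reserved exclusively for the job, the same data group with memdiv switched on would read (a sketch; the tasksize shown is simply the default again):

$pardft tasksize=1000 memdiv=1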
For dscf and grad runs you need a parallel statistics file
which has to be generated in advance. The filename is specified
with
$2e-ints_shell_statistics file=DSCF-par-stat
or
$2e-ints'_shell_statistics file=GRAD-par-stat
respectively.
The statistics files have to be generated with a single node dscf or grad run. For a dscf statistics run one uses the keywords:
$statistics  dscf parallel
$2e-ints_shell_statistics  file=DSCF-par-stat
$parallel_parameters
   maxtask=400
   maxdisk=0
   dynamic_fraction=0.300000

and for a grad statistics run:

$statistics  grad parallel
$2e-ints'_shell_statistics  file=GRAD-par-stat
$parallel_parameters
   maxtask=400
maxtask
is the maximum number of two-electron integral tasks,
maxdisk
defines the maximum task size with respect to mass storage
(MBytes) and
dynamic_fraction
is the fraction of two-electron integral tasks
which will be allocated dynamically.
For parallel grad and rdgrad runs one can also specify:
$grad_send_dens

This means that the density matrix is computed by one node and distributed to the other nodes rather than being computed by every slave.
In the parallel version of ridft, the first client reads the keyword $ricore from the control file and uses the given memory for the additional RI matrices and for RI-integral storage. All other clients use the same amount of memory as the first client, although they do not need to store any of those matrices; this leads to a better usage of the available memory per node. However, if the number of auxiliary basis functions is large, the RI matrices may become bigger than the specified $ricore, and all clients will then use as much memory as those matrices require, even if that amount is much larger than the given memory. To avoid this behaviour one can use:
$ricore_slave   integer

specifying the number of MB that shall be used on each client.
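A sketch of such a setting (the memory values are purely illustrative): the first client uses the memory given in $ricore for the RI matrices, while every other client is restricted to the amount given in $ricore_slave:

$ricore 500
$ricore_slave 200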
For parallel jobex runs one has to specify all the parallel keywords needed for the different parts of the geometry optimization, i.e. those for dscf and grad, or those for ridft and rdgrad, or those for dscf and mpgrad.
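As an illustration, the control file for a parallel ridft/rdgrad geometry optimization might then contain, besides the usual input, a combination of the keywords discussed above, e.g. (all values are examples only):

$parallel_platform MPP
$numprocs 8
$pardft tasksize=1000 memdiv=0
$grad_send_dens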