Running on parallel machines

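Parallel execution is strongly system- and installation-dependent. Typically one has to specify:

1. a launcher program, such as poe, mpirun, or mpiexec;
2. the number of processors, usually given as an option to the launcher;
3. the number of pools into which the processors are grouped.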

The last item is optional and is read by the code. The first and second items are machine- and installation-dependent, and may be different for interactive and batch execution.

Please note: your machine might be configured to disallow interactive execution. If in doubt, ask your system administrator.


For illustration, here is how to run pw.x on 16 processors partitioned into 8 pools (2 processors each) in several typical cases. For convenience, we also give the corresponding values of PARA_PREFIX and PARA_POSTFIX to be used when running the examples distributed with Quantum-ESPRESSO (see the section "Run examples").

IBM SP machines, batch:
pw.x -npool 8 < input

PARA_PREFIX="", PARA_POSTFIX="-npool 8"
This should also work interactively, with the environment variable NPROC set to 16 and MP_HOSTFILE set to the file containing the list of processors.

IBM SP machines, interactive, using poe:
poe pw.x -procs 16 -npool 8 < input

PARA_PREFIX="poe", PARA_POSTFIX="-procs 16 -npool 8"

SGI Origin and PC clusters, using mpirun:
mpirun -np 16 pw.x -npool 8 < input

PARA_PREFIX="mpirun -np 16", PARA_POSTFIX="-npool 8"

PC clusters, using mpiexec:
mpiexec -n 16 pw.x -npool 8 < input

PARA_PREFIX="mpiexec -n 16", PARA_POSTFIX="-npool 8"

Cray T3E (old):
mpprun -n 16 pw.x -npool 8 < input

PARA_PREFIX="mpprun -n 16", PARA_POSTFIX="-npool 8"
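
The pattern shared by the cases above (launcher prefix, then pw.x, then the postfix, then the input) can be sketched as a small shell helper. This is only an illustration: the function names build_pw_command and procs_per_pool are hypothetical, and the divisibility check assumes that the processors are split evenly among the pools, as in the 16-processor, 8-pool example. The helper prints the command line instead of running it, so it can be tried on a machine without pw.x or an MPI launcher.

```shell
#!/bin/sh
# Hypothetical helper: assemble a pw.x command line from a launcher prefix
# and a postfix, in the same order as the examples above.
build_pw_command() {
    prefix=$1    # e.g. "mpirun -np 16" (may be empty)
    postfix=$2   # e.g. "-npool 8"
    infile=$3    # name of the input file
    # An empty prefix would leave a leading blank, so handle it separately.
    if [ -n "$prefix" ]; then
        echo "$prefix pw.x $postfix < $infile"
    else
        echo "pw.x $postfix < $infile"
    fi
}

# Hypothetical helper: processors per pool, assuming an even split.
procs_per_pool() {
    nproc=$1
    npool=$2
    if [ "$npool" -le 0 ] || [ $((nproc % npool)) -ne 0 ]; then
        echo "error: $nproc processors cannot be split evenly into $npool pools" >&2
        return 1
    fi
    echo $((nproc / npool))
}

build_pw_command "mpirun -np 16" "-npool 8" input
procs_per_pool 16 8
```

Passing an empty prefix reproduces the IBM SP batch form, in which no launcher appears on the command line.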

Note that each processor writes its own set of temporary wavefunction files during the calculation. If wf_collect=.true. (in namelist control), the final result is collected into a single file whose format is independent of the number of processors; otherwise, one wavefunction file per processor is left on disk. In the latter case, the files can be read only by a job running on the same number of processors and pools, and only if all files are on a file system that is visible to all processors. In other words, you cannot use local scratch directories, because there is presently no way to ensure that processes are distributed over processors in the same pattern in different jobs.
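
As a minimal sketch of the setting mentioned above, the following shell lines write only the control namelist with wf_collect enabled. The file name and location are placeholders chosen for this illustration, and a real pw.x input needs further variables, namelists, and cards that are deliberately left out here.

```shell
#!/bin/sh
# Write only the part of the input that matters here: the control namelist
# with wf_collect enabled. The file name is a placeholder, and this fragment
# is not a complete pw.x input.
fragment="${TMPDIR:-/tmp}/control_fragment.in"
cat > "$fragment" <<'EOF'
&control
    wf_collect = .true.
/
EOF

# Show what was written.
cat "$fragment"
```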

Some implementations of the MPI library may have problems with input redirection in parallel. If this happens, use the option -in (or -inp or -input), followed by the input file name. Example: pw.x -in input -npool 4 > output.
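
The two ways of passing the input file can be compared side by side with a short sketch. The function name pw_invocation and its mode keywords are hypothetical; the function only prints the two forms of the command line and does not run pw.x.

```shell
#!/bin/sh
# Hypothetical helper: print the pw.x invocation either with shell input
# redirection or with the -in option described above.
pw_invocation() {
    mode=$1      # "redirect" or "option"
    infile=$2    # name of the input file
    npool=$3     # number of pools
    case $mode in
        redirect) echo "pw.x -npool $npool < $infile" ;;
        option)   echo "pw.x -in $infile -npool $npool" ;;
        *)        echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

pw_invocation redirect input 4
pw_invocation option input 4
```

With the option form, the input file name is an ordinary command-line argument, so the program does not depend on the launcher forwarding standard input.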

Please note that all postprocessing codes that do not read data files produced by pw.x (that is, average.x, voronoy.x, dos.x), the plotting codes plotrho.x and plotband.x, and all executables in pwtools/ should be executed on just one processor. Running these codes on more than one processor may produce unpredictable results.
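
A job script can guard against this mistake by adding the parallel launcher only for codes that may run in parallel. The sketch below is an illustration under stated assumptions: the function names is_serial_only and launch_line are hypothetical, pwtools/some_tool.x is a made-up placeholder name, and the list of serial codes is simply the one given in the paragraph above.

```shell
#!/bin/sh
# Hypothetical helper: return success (0) if the executable is one of the
# codes that the text says must run on a single processor.
is_serial_only() {
    case $1 in
        pwtools/*|*/pwtools/*) return 0 ;;
    esac
    case $(basename "$1") in
        average.x|voronoy.x|dos.x|plotrho.x|plotband.x) return 0 ;;
        *) return 1 ;;
    esac
}

# Prepend the parallel launcher only when the code may run in parallel.
launch_line() {
    prefix=$1   # e.g. "mpirun -np 16"
    exe=$2      # executable name or path
    if is_serial_only "$exe" || [ -z "$prefix" ]; then
        echo "$exe"
    else
        echo "$prefix $exe"
    fi
}

launch_line "mpirun -np 16" pw.x
launch_line "mpirun -np 16" dos.x
launch_line "mpirun -np 16" pwtools/some_tool.x   # placeholder name
```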

The PWSCF Group - 2005-11-18