This is the README file for the QuantumESPRESSO application benchmark,
distributed with the DEISA Benchmark Suite:
http://www.deisa.eu/science/benchmarking/
Last modified by the DEISA Benchmark Team on 2008-09-09.
----------------------
QuantumESPRESSO readme
----------------------
Contents
-------
1 General description
2 Code structure
3 Parallelisation
4 Building
5 Running the code
6 Input and output data
1 General description
=====================
Quantum ESPRESSO is an integrated suite of computer codes for electronic-
structure calculations and materials modeling at the nanoscale. It is based on
density-functional theory, plane waves, and pseudopotentials (both norm-
conserving and ultrasoft).
Quantum ESPRESSO stands for opEn Source Package for Research in Electronic
Structure, Simulation, and Optimization. It is freely available to researchers
around the world under the terms of the GNU General Public License.
Quantum ESPRESSO builds upon newly restructured electronic-structure codes
(PWscf, PHONON, CP90, FPMD, Wannier) that have been developed and tested by
some of the original authors of novel electronic-structure algorithms - from
Car-Parrinello molecular dynamics to density-functional perturbation theory -
and applied in the last twenty years by some of the leading materials modeling
groups worldwide. Innovation and efficiency are still our main focus.
Quantum ESPRESSO is evolving towards a distribution of independent and inter-
operable codes in the spirit of an open-source project. Researchers active in
the field of electronic-structure calculations are encouraged to participate
in the project by contributing their own codes or by implementing their own
ideas into existing codes.
Quantum ESPRESSO is an initiative of the DEMOCRITOS National Simulation Center
(Trieste) and of its partners, in collaboration with the CINECA National
Supercomputing Center in Bologna, the Ecole Polytechnique Federale de
Lausanne, Princeton University, and the Massachusetts Institute of
Technology. Courses on modern electronic-structure theory with hands-on
tutorials on the Quantum ESPRESSO codes are offered on a regular basis in
developed as well as developing countries, in collaboration with the Abdus
Salam International Centre for Theoretical Physics in Trieste.
Of the codes available in the QuantumESPRESSO package, the DEISA benchmark
uses PWscf; the version of QuantumESPRESSO used is 4.0. For more
information on QuantumESPRESSO, see the code's website at
http://www.quantum-espresso.com/.
2 Code structure
================
The DEISA benchmark "QuantumESPRESSO" uses the PWscf code, included in the
QuantumESPRESSO package. PWscf computes the ground-state total energy of a
system of atoms and can perform different kinds of phase-space sampling of the
ionic degrees of freedom and energy minimisation, such as Born-Oppenheimer
Molecular Dynamics or BFGS energy minimisation. The scheme of the computations
implemented by the code is as follows:
  read input or restart file
  initialise the computation and distribute ions between MPI tasks
  start ion cycle
    start iterative energy minimisation (SCF iteration)
      compute potentials from the charge density
      diagonalise the Hamiltonian to find wave functions
      compute the new charge density from the wave functions
    end iterative energy minimisation
    update the ions' positions
    update the geometry of the system
  end ion cycle
  dump relevant physical quantities on the disk
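The control flow above can be sketched in Python as two nested loops. This is
only a structural illustration under loud assumptions: the function name and
the four callables (compute_potential, diagonalise, new_density, move_ions)
are hypothetical placeholders, the convergence test on the density change is
one common choice and not necessarily the criterion PWscf applies, and the toy
usage replaces all physics with scalar arithmetic.

```python
def run_ion_cycle(density, positions, compute_potential, diagonalise,
                  new_density, move_ions, n_ion_steps=1,
                  scf_tol=1e-8, max_scf_iter=100):
    """Skeleton of the ion cycle with an inner SCF loop (hypothetical names)."""
    scf_iter = 0
    for _ in range(n_ion_steps):                       # start ion cycle
        for scf_iter in range(1, max_scf_iter + 1):    # start SCF iteration
            potential = compute_potential(density, positions)
            wavefunctions = diagonalise(potential)
            updated = new_density(wavefunctions)
            converged = abs(updated - density) < scf_tol
            density = updated
            if converged:
                break                                  # end SCF iteration
        positions = move_ions(positions, density)      # update the ions
    return density, positions, scf_iter


# Toy usage: a scalar fixed-point problem rho = 0.5 * rho + 1 stands in for
# the real potential / diagonalisation / density steps (assumption).
density, positions, n_iter = run_ion_cycle(
    density=0.0,
    positions=0.0,
    compute_potential=lambda rho, pos: 0.5 * rho + pos,
    diagonalise=lambda potential: potential,
    new_density=lambda psi: psi + 1.0,
    move_ions=lambda pos, rho: pos,                    # ions kept fixed
)
print(density, positions, n_iter)
```

In the toy model the inner loop converges to a density close to 2, which is
the fixed point of the scalar map; the ion step leaves the positions unchanged.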
3 Parallelisation
=================
QuantumESPRESSO codes are parallelised purely with MPI. Data structures are
distributed across processors organised in a hierarchy of groups, which are
identified by different MPI communicator levels. The group hierarchy is as
follows:
                            __ task groups
world __ images __ pools __/
                           \__ ortho group
"world" is the group of all processors (MPI_COMM_WORLD).
Processors can be divided into different "images", corresponding to a point in
configuration space (i.e., to a set of atomic positions). Such partitioning is
used when performing nudged elastic band (NEB), metadynamics, and Laio-
Parrinello simulations.
When k-point sampling is used for a given physical system, each image group
can be subpartitioned into "pools". k-points can be distributed across pools.
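As an illustration of distributing k-points across pools, the sketch below
splits the k-point indices into contiguous, nearly equal blocks, one per pool.
The function name and the block-wise scheme are assumptions made for this
example; the scheme PWscf actually uses may differ.

```python
def distribute_kpoints(n_kpoints, n_pools):
    """Return, for each pool, the list of k-point indices it would own."""
    if n_pools < 1 or n_kpoints < n_pools:
        raise ValueError("need at least one k-point per pool")
    base, extra = divmod(n_kpoints, n_pools)
    blocks, start = [], 0
    for pool in range(n_pools):
        # The first `extra` pools take one additional k-point each.
        size = base + (1 if pool < extra else 0)
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks


print(distribute_kpoints(5, 2))
```

With five k-points and two pools, the first pool receives three k-points and
the second receives two.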
Within each pool, the reciprocal-space basis set (plane waves) and real-space
vectors are distributed across processors. A 3D FFT is used to transform
electronic wave functions from reciprocal to real space and vice versa.
In order to allow good parallelisation of the 3D FFT when the number of
processors exceeds the number of FFT planes, data can be redistributed to
"task groups" so that each group can process several wavefunctions at the same
time.
At each SCF iteration, the Hamiltonian is diagonalised using the iterative
Davidson algorithm based on standard linear algebra operations. The Hamiltonian
and other matrices used in the iterative diagonalisation are distributed
block-wise across the "ortho group", a subgroup of the pool processors
organised in a square 2D grid.
Images and pools are loosely coupled: processors in different images or pools
communicate only occasionally. Processors within each pool, by contrast, are
tightly coupled, and communication between them is significant.
To control the number of images, pools, and task groups, the command line
arguments -nimage, -npool, and -ntg can be used. The dimension of the ortho
group is set to the largest value compatible with the number of processors and
with the number of electronic states. In PWscf, a smaller value can be chosen
with the command line switch -ndiag. Default values are: nimage=1, npool=1,
and ntg=1; ndiag is chosen by the code.
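The arithmetic behind these settings can be sketched as follows. The function
and key names are hypothetical, and the sketch makes three assumptions: the
processor count divides evenly at each level, each task group holds an equal
share of the pool processors, and the upper bound on the ortho group is the
largest square grid that fits in one pool (the additional limit set by the
number of electronic states is ignored here).

```python
import math


def processor_layout(cores, nimage=1, npool=1, ntg=1):
    """Processors per image, pool and task group, plus the largest square grid."""
    if cores % nimage:
        raise ValueError("cores must be divisible by nimage")
    per_image = cores // nimage
    if per_image % npool:
        raise ValueError("processors per image must be divisible by npool")
    per_pool = per_image // npool
    if per_pool % ntg:
        raise ValueError("processors per pool must be divisible by ntg")
    side = math.isqrt(per_pool)        # side of the largest square 2D grid
    return {
        "per_image": per_image,
        "per_pool": per_pool,
        "per_task_group": per_pool // ntg,
        "max_ortho_grid_side": side,
        "max_ndiag": side * side,
    }


print(processor_layout(64, npool=2, ntg=2))
```

Under these assumptions, 64 cores with -npool 2 give 32 processors per pool,
so the largest square grid is 5 x 5 = 25 processors.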
When running the DEISA benchmark, the following command line arguments should
be used:
Dataset      Cores   Arguments
---------------------------------------------
AUSURF112       64   -npool 2
               128   -npool 2
               256   -npool 2
               512   -npool 2
---------------------------------------------
AUSURF112       64   -npool 2 -ntg 2 -ndiag 25
taskgroup      128   -npool 2 -ntg 2 -ndiag 36
               256   -npool 2 -ntg 4 -ndiag 49
               512   -npool 2 -ntg 8 -ndiag 64
---------------------------------------------
4 Building
==========
To build QuantumESPRESSO, all you need is a working Fortran 95 compiler, a C
compiler and an MPI library preinstalled on your system. The compilation will
be done by JUBE.
In the directory QuantumESPRESSO/src, there are two files: compile.sh.in and
espresso-src.tar.gz. The file compile.sh.in is a template for the wrapper
script compile.sh. JUBE reads compile.sh.in and replaces all placeholders in
it with variables defined in compile.xml. The file espresso-src.tar.gz is a
gzipped tar package containing the QuantumESPRESSO source. You don't need to
touch either of these files; JUBE does everything for you automatically.
QuantumESPRESSO has been ported to a significant number of architectures
around the world. Nevertheless, it may happen that configuration or
compilation fails, especially if the code is being ported to a new
architecture. Configuration often fails because the configure script is not
able to guess the architecture being used. In this case you may try to suggest
the correct architecture in the file compile.xml with the substitution:
Here _MY_ARCH_ is the architecture you want to suggest. In case the configure
script does not find a working compiler, you can suggest it with:
Here _MY_MPI_F95_COMPILER_ is the compiler installed on your system.
QuantumESPRESSO is self-contained, which means that it compiles without any
auxiliary libraries. This is, however, usually not the best choice for
performance. It is recommended to use at least optimised BLAS and LAPACK
libraries. In case the configure script is not able to find them, you may
suggest them with:
The variables $blas_lib and $lapack_lib are defined in the file
DEISA_BENCH/platform/platform.xml.
In case the configure script works fine but compilation fails, you may go to
the directory where JUBE ran the compile step and edit by hand the file
"make.sys", which contains all macros and variables used by the make command.
For more information on compilation, see the QuantumESPRESSO on-line
documentation at: http://www.quantum-espresso.org/wiki/index.php
5 Running the code
==================
For each benchmark, JUBE automatically generates a job script and a directory
where the job is run. The file execute.xml describes how the job script is
set up. Inputs for the different benchmark cases are taken from the directory
QuantumESPRESSO/input, as described in the prepare.xml file. To select a given
benchmark case, edit the file bench-.xml and set active="1" in
the benchmark tag you are interested in.
To run the benchmarks with JUBE, simply execute:
../../bench/jube bench-.xml
To run the benchmarks manually, create a run directory on a filesystem seen by
all compute nodes with at least 100 GB of free space. Copy the files
Au.pbe-nd-van.UPF and ausurf.in from the directory input/AUSURF112 to the run
directory. Finally create a job script containing the following command:
./pw.x -input ausurf.in -npool 2 > ausurf.out
Prefix this command with the program used to launch the executable on the
compute nodes, for example "mpirun".
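A job script line for any of the benchmark configurations can be composed with
a small helper such as the sketch below. The function name is hypothetical,
and the launcher tokens are supplied by the caller; the "-np 64" option used
in the example is an assumption about the local "mpirun" and may need to be
replaced by whatever your system's launcher expects.

```python
import shlex


def pw_command(launcher, npool=1, ntg=None, ndiag=None,
               input_file="ausurf.in", output_file="ausurf.out"):
    """Build the shell command line that runs pw.x for one benchmark case."""
    args = list(launcher) + ["./pw.x", "-input", input_file,
                             "-npool", str(npool)]
    if ntg is not None:                # only for the task-group runs
        args += ["-ntg", str(ntg)]
    if ndiag is not None:              # optional size of the ortho group
        args += ["-ndiag", str(ndiag)]
    return shlex.join(args) + " > " + shlex.quote(output_file)


# Launcher tokens are site-specific (assumption: mpirun accepts "-np").
print(pw_command(["mpirun", "-np", "64"], npool=2))
print(pw_command(["mpirun", "-np", "64"], npool=2, ntg=2, ndiag=25))
```

Keeping the launcher as a separate list of tokens means that only that one
argument changes between systems, while the pw.x arguments stay the same.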
For more information on running the code, see the QuantumESPRESSO on-line
documentation at: http://www.quantum-espresso.org/wiki/index.php
6 Input and output data
=======================
In the directory input, there are several input datasets:
input/AUSURF3 Gold surface (3 atoms)
input/AUSURF112 Gold surface (112 atoms)
input/PSIWAT Protein-gold-water system (586 atoms)
Only the directory input/AUSURF112 is relevant for the DEISA benchmark. In
this directory there are two files:
Au.pbe-nd-van.UPF Gold pseudopotentials
ausurf.in QuantumESPRESSO keywords and system geometry
In the input directory there is also a simple Fortran 90 program,
input/surf_generator.f90, which can be used to generate gold surfaces of any
size. It may be useful to create larger benchmarks.
All benchmark-related information is printed to standard output.
The relevant information for timing is the time after the first iteration,
which is printed in a line that looks like:
Other useful timing information is printed at the end of the output:
For verification, the relevant information is instead the value of the total
energy after the first iteration, printed in a line that looks like:
Other files:
-----------