This is the README file for the QuantumESPRESSO application benchmark,
distributed with the DEISA Benchmark Suite:

  http://www.deisa.eu/science/benchmarking/

Last modified by the DEISA Benchmark Team on 2008-09-09.

----------------------
QuantumESPRESSO readme
----------------------

Contents
--------

1 General description
2 Code structure
3 Parallelisation
4 Building
5 Running the code
6 Input and output data


1 General description
=====================

Quantum ESPRESSO is an integrated suite of computer codes for
electronic-structure calculations and materials modeling at the nanoscale. It
is based on density-functional theory, plane waves, and pseudopotentials (both
norm-conserving and ultrasoft).

Quantum ESPRESSO stands for opEn Source Package for Research in Electronic
Structure, Simulation, and Optimization. It is freely available to researchers
around the world under the terms of the GNU General Public License.

Quantum ESPRESSO builds on newly restructured electronic-structure codes
(PWscf, PHONON, CP90, FPMD, Wannier) that have been developed and tested by
some of the original authors of novel electronic-structure algorithms - from
Car-Parrinello molecular dynamics to density-functional perturbation theory -
and applied in the last twenty years by some of the leading materials modeling
groups worldwide. Innovation and efficiency are still our main focus.

Quantum ESPRESSO is evolving towards a distribution of independent and
interoperable codes in the spirit of an open-source project. Researchers
active in the field of electronic-structure calculations are encouraged to
participate in the project by contributing their own codes or by implementing
their own ideas in existing codes.

Quantum ESPRESSO is an initiative of the DEMOCRITOS National Simulation Center
(Trieste) and of its partners, in collaboration with the CINECA National
Supercomputing Center in Bologna, the Ecole Polytechnique Federale de Lausanne,
Princeton University, and the Massachusetts Institute of Technology. Courses on
modern electronic-structure theory with hands-on tutorials on the Quantum
ESPRESSO codes are offered on a regular basis in developed as well as
developing countries, in collaboration with the Abdus Salam International
Centre for Theoretical Physics in Trieste.

Of the codes available in the QuantumESPRESSO package, the DEISA benchmark
uses PWscf. The version of QuantumESPRESSO used is 4.0.

For more information on QuantumESPRESSO, see the code's website at
http://www.quantum-espresso.com/.


2 Code structure
================

The DEISA benchmark "QuantumESPRESSO" uses the PWscf code, included in the
QuantumESPRESSO package. PWscf computes the ground-state total energy of a
system of atoms. It can also sample the phase space of the ionic degrees of
freedom and minimise the energy in different ways, for example with
Born-Oppenheimer molecular dynamics or BFGS energy minimisation.

The schema of the computations implemented by the code is as follows:

  read input or restart file
  initialise the computation and distribute ions between MPI tasks
  start ion cycle
    start iterative energy minimisation (SCF iteration)
      compute potentials from the charge density
      diagonalise the Hamiltonian to find wave functions
      compute the new charge density from the wave functions
    end iterative energy minimisation
    update ion positions
    update the geometry of the system
  end ion cycle
  dump relevant physical quantities on the disk


3 Parallelisation
=================

QuantumESPRESSO codes are parallelised purely with MPI. Data structures are
distributed across processors organised in a hierarchy of groups, which are
identified by different MPI communicator levels.
The group hierarchy is as follows:

                              __ task groups
  world __ images __ pools __/
                              \__ ortho group

"world" is the group of all processors (MPI_COMM_WORLD).

Processors can be divided into different "images", each corresponding to a
point in configuration space (i.e., to a set of atomic positions). Such
partitioning is used when performing nudged elastic band (NEB), metadynamics,
and Laio-Parrinello simulations.

When k-point sampling is used for a given physical system, each image group
can be subpartitioned into "pools". The k-points can then be distributed
across the pools.

Within each pool, the reciprocal-space basis set (plane waves) and the
real-space vectors are distributed across processors. A 3D FFT is used to
transform electronic wave functions from reciprocal to real space and vice
versa. In order to allow good parallelisation of the 3D FFT when the number of
processors exceeds the number of FFT planes, data can be redistributed to
"task groups" so that each group can process several wave functions at the
same time.

At each SCF iteration, the Hamiltonian is diagonalised using the iterative
Davidson algorithm, which is based on standard linear algebra operations. The
Hamiltonian and the other matrices used in the iterative diagonalisation are
distributed block-like across the "ortho group", a subgroup of the pool
processors organised in a square 2D grid.

Images and pools are loosely coupled: processors in different images or pools
communicate only occasionally. Processors within each pool are tightly coupled
and communication between them is significant.

To control the number of images, pools, and task groups, the command line
arguments

  -nimage  -npool  -ntg

can be used. The dimension of the ortho group is set to the largest value
compatible with the number of processors and with the number of electronic
states. In PWscf, a smaller value can be chosen with the command line switch
-ndiag.

Default values are: nimage=1, npool=1, and ntg=1; ndiag is chosen by the code.
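
The relationship between the core count and the group sizes described above
can be sketched in a few lines of Python. This is only an illustrative sketch,
not code taken from QuantumESPRESSO: the function name process_layout and the
keys it returns are hypothetical, it assumes that each level divides the one
above it evenly, and it takes the largest square number not exceeding the pool
size as an upper bound for the ortho group, ignoring the additional limit set
by the number of electronic states.

```python
import math


def process_layout(ncores, nimage=1, npool=1, ntg=1):
    """Return the group sizes implied by the image/pool/task-group split.

    Hypothetical helper for illustration only; the defaults mirror the
    documented defaults nimage=1, npool=1, ntg=1.
    """
    # Assumption: every level must divide the level above it evenly.
    if ncores % nimage != 0:
        raise ValueError("ncores must be a multiple of nimage")
    per_image = ncores // nimage

    if per_image % npool != 0:
        raise ValueError("processors per image must be a multiple of npool")
    per_pool = per_image // npool

    if per_pool % ntg != 0:
        raise ValueError("processors per pool must be a multiple of ntg")
    per_task_group = per_pool // ntg

    # The ortho group is a square 2D grid inside a pool, so its size is a
    # perfect square no larger than the pool size. The limit imposed by the
    # number of electronic states is NOT modelled here.
    side = math.isqrt(per_pool)

    return {
        "per_image": per_image,
        "per_pool": per_pool,
        "per_task_group": per_task_group,
        "ortho_upper_bound": side * side,
    }


if __name__ == "__main__":
    for cores in (64, 128, 256, 512):
        print(cores, process_layout(cores, npool=2))
```

For example, with 64 cores and -npool 2 each pool holds 32 processors, so a
square ortho grid can contain at most 25 of them (5 x 5).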

When running the DEISA benchmark, the following command line arguments should
be used:

  Dataset      Cores   Arguments
  ---------------------------------------------
  AUSURF112       64   -npool 2
                 128   -npool 2
                 256   -npool 2
                 512   -npool 2
  ---------------------------------------------
  AUSURF112       64   -npool 2 -ntg 2 -ndiag 25
  taskgroup      128   -npool 2 -ntg 2 -ndiag 36
                 256   -npool 2 -ntg 4 -ndiag 49
                 512   -npool 2 -ntg 8 -ndiag 64
  ---------------------------------------------


4 Building
==========

To build QuantumESPRESSO, all you need is a working Fortran 95 compiler, a C
compiler, and an MPI library preinstalled on your system.

The compilation is done by JUBE. In the directory QuantumESPRESSO/src, there
are two files: compile.sh.in and espresso-src.tar.gz. The file compile.sh.in
is a template for the wrapper script compile.sh. JUBE reads compile.sh.in and
replaces all placeholders in it with variables defined in compile.xml. The
file espresso-src.tar.gz is a gzipped tar package containing the
QuantumESPRESSO source. You don't need to touch either of these files; JUBE
does everything for you automatically.

QuantumESPRESSO has been ported to a significant number of architectures
around the world. Nevertheless, configuration or compilation may fail,
especially if the code is being ported to a new architecture.

Configuration often fails because the configure script is not able to guess
the architecture being used. In this case you may try to suggest the correct
architecture in the file compile.xml with the substitution:

Here _MY_ARCH_ is the architecture you want to suggest.

In case the configure script does not find a working compiler, you can suggest
it with:

Here _MY_MPI_F95_COMPILER_ is the compiler installed on your system.

QuantumESPRESSO is self-contained, which means that it compiles without any
auxiliary libraries. This is, however, usually not the best choice for
performance. It is recommended to use, at least, optimised BLAS and LAPACK
libraries.
In case the configure script is not able to find them, you may suggest them
with:

The variables $blas_lib and $lapack_lib are defined in the file
DEISA_BENCH/platform/platform.xml.

In case the configure script works fine but compilation fails, you may try to
go to the directory where JUBE has run the compile step and modify by hand the
file "make.sys", which contains all macros and variables used by the command
make.

For more information on compilation, see the QuantumESPRESSO on-line
documentation at:

  http://www.quantum-espresso.org/wiki/index.php


5 Running the code
==================

JUBE automatically generates, for each benchmark, a job script and a directory
where the job is run. The file execute.xml describes how the job script is set
up. Inputs for the different benchmark cases are taken from the directory
QuantumESPRESSO/input, as described in the prepare.xml file.

To select a given benchmark case, edit the file bench-.xml and set active="1"
in the benchmark tag you are interested in. To run the benchmarks with JUBE,
simply execute:

  ../../bench/jube bench-.xml

To run the benchmarks manually, create a run directory on a filesystem that is
seen by all compute nodes and has at least 100 GB of free space. Copy the
files Au.pbe-nd-van.UPF and ausurf.in from the directory input/AUSURF112 to
the run directory. Finally, create a job script containing the following
command:

  ./pw.x -input ausurf.in -npool 2 > ausurf.out

Here is the program used to load the executable on compute nodes, for example
"mpirun".

For more information on running the code, see the QuantumESPRESSO on-line
documentation at:

  http://www.quantum-espresso.org/wiki/index.php


6 Input and output data
=======================

In the directory input, there are several input datasets:

  input/AUSURF3      Gold surface (3 atoms)
  input/AUSURF112    Gold surface (112 atoms)
  input/PSIWAT       Protein-gold-water system (586 atoms)

Only the directory input/AUSURF112 is relevant for the DEISA benchmark.
In this directory there are two files:

  Au.pbe-nd-van.UPF    Gold pseudopotentials
  ausurf.in            QuantumESPRESSO keywords and system geometry

In the input directory there is also a simple Fortran 90 program,
input/surf_generator.f90, which can be used to generate gold surfaces of any
size. It may be useful for creating larger benchmarks.

All benchmark-related information is printed to standard output. The relevant
information for timing is the time after the first iteration, which is printed
in a line that looks like:

Other useful timing information is printed at the end of the output:

For verification, the relevant information is instead the value of the total
energy after the first iteration, which looks like:

Other files:
------------