Results
This page contains results obtained by the DEISA benchmarking team on test platforms (see "Platforms used" for details).
See also
Archive
- Retired platforms
- Retired codes: DL_POLY, ECHAM5, PEPC-b1.5
- Result archive (to be released)
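The Efficiency column in the tables below is not defined on this page, but the listed values are consistent with the usual parallel efficiency: the speedup between the two runs divided by the increase in core count. The following is a minimal Python sketch under that assumption, checked against two rows of the tables. The function name `parallel_efficiency` is illustrative and not part of the benchmark suite.

```python
def parallel_efficiency(t_small, t_large, cores_small, cores_large):
    """Speedup between two runs divided by the growth in core count.

    Assumed definition: it reproduces the listed values but is not
    stated explicitly on this page.
    """
    speedup = t_small / t_large
    scale = cores_large / cores_small
    return speedup / scale


# BQCD, HLRB II high-density, 24*24*24*48 dataset:
# 23.93 s on 512 cores, 8.36 s on 2048 cores
print(round(parallel_efficiency(23.93, 8.36, 512, 2048), 3))  # 0.716

# Fenfloss, HECToR X2, Cavity_224 dataset:
# 296.09 s on 8 cores, 33.56 s on 64 cores
print(round(parallel_efficiency(296.09, 33.56, 8, 64), 3))  # 1.103
```

Under this reading, a value above 1 (as in the Fenfloss rows for HECToR X2 and Babel) means the run scaled better than linearly between the two core counts.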
| BQCD | Datasets | Time (in seconds) | | Efficiency | Comment |
|---|---|---|---|---|---|
| | | 512 cores | 2048 cores | 512 to 2048 | |
| HLRB II high-density | 24*24*24*48 50 iterations | 23.93 | 8.36 | 0.716 | CG solver time |
| | 48*48*48*96 50 iterations | 307.92 | 102.41 | 0.752 | |
| | 48*48*48*96 100 iterations | | | | |
| | | 512 cores | 2048 cores | 512 to 2048 | |
| HLRB II high-bandwidth | 24*24*24*48 50 iterations | 13.23 | 6.85 | 0.483 | CG solver time |
| | 48*48*48*96 50 iterations | 232.27 | 60.17 | 0.965 | |
| | 48*48*48*96 100 iterations | | | | |
| | | 512 cores | 2048 cores | 512 to 2048 | |
| vip ST mode | 24*24*24*48 50 iterations | | | | |
| | 48*48*48*96 50 iterations | | | | |
| | 48*48*48*96 100 iterations | | | | |
| | | 512 cores | 2048 cores | 512 to 2048 | |
| vip SMT mode | 24*24*24*48 50 iterations | | | | |
| | 48*48*48*96 50 iterations | | | | |
| | 48*48*48*96 100 iterations | | | | |
| | | 512 cores | 2048 cores | 512 to 2048 | |
| HECToR XT4 | 24*24*24*48 50 iterations | | | | |
| | 48*48*48*96 50 iterations | | | | |
| | 48*48*48*96 100 iterations | | | | |
| | | 8 cores | 64 cores | 8 to 64 | |
| HECToR X2 | 24*24*24*48 50 iterations | | | | |
| | 48*48*48*96 50 iterations | | | | |
| | 48*48*48*96 100 iterations | | | | |
| | | 512 cores | 2048 cores | 512 to 2048 | |
| Babel | 24*24*24*48 50 iterations | 33.93 | 14.21 | 0.597 | CG solver time |
| | 48*48*48*96 50 iterations | 582.67 | 161.56 | 0.902 | |
| | 48*48*48*96 100 iterations | 1154.21 | 319.87 | 0.902 | |
| CPMD | Datasets | Time (in seconds) | Efficiency | Comment | |
|---|---|---|---|---|---|
| 64 cores | 256 cores | 64 to 256 | |||
| HLRB II high-density |
H2O_128mol | 7.70 | 4.50 | 0.428 | <- time per timestep |
| H2O_256mol | 39.17 | 10.99 | 0.891 | ||
| H2O_128mol taskgroup | <- using 4 task group | ||||
| H2O_256mol taskgroup | <- using 4 task group | ||||
| H2O_384mol taskgroup | |||||
| H2O_512mol taskgroup | |||||
| 64 cores | 256 cores | 64 to 256 | |||
| HLRB II high-bandwidth |
H2O_128mol | 6.07 | 8.95 | 0.170 | <- time per timestep |
| H2O_256mol | 25.96 | 34.75 | 0.187 | ||
| H2O_128mol taskgroup | <- using 4 task group | ||||
| H2O_256mol taskgroup | <- using 4 task group | ||||
| H2O_384mol taskgroup | |||||
| H2O_512mol taskgroup | |||||
| 64 cores | 256 cores | 64 to 256 | |||
| vip ST mode |
H2O_128mol | ||||
| H2O_256mol | |||||
| H2O_128mol taskgroup | |||||
| H2O_256mol taskgroup | |||||
| H2O_384mol taskgroup | |||||
| H2O_512mol taskgroup | |||||
| 64 cores | 256 cores | 64 to 256 | |||
| vip SMT mode |
H2O_128mol | ||||
| H2O_256mol | |||||
| H2O_128mol taskgroup | |||||
| H2O_256mol taskgroup | |||||
| H2O_384mol taskgroup | |||||
| H2O_512mol taskgroup | |||||
| 64 cores | 256 cores | 64 to 256 | |||
| HECToR XT4 |
H2O_128mol | ||||
| H2O_256mol | |||||
| H2O_128mol taskgroup | |||||
| H2O_256mol taskgroup | |||||
| H2O_384mol taskgroup | |||||
| H2O_512mol taskgroup | |||||
| 8 cores | 64 cores | 8 to 64 | |||
| HECToR X2 |
H2O_128mol | ||||
| H2O_256mol | |||||
| H2O_128mol taskgroup | |||||
| H2O_256mol taskgroup | |||||
| H2O_384mol taskgroup | |||||
| H2O_512mol taskgroup | |||||
| 64 cores | 256 cores | 64 to 256 | |||
| Babel | H2O_128mol | 7.64 | 4.39 | 0.435 | <-taskgroup=1 SMP MESH |
| H2O_256mol | 17.08 | <-taskgroup=1 DUAL MESH | |||
| H2O_128mol taskgroup | 6.30 | 2.80 | 0.562 | <-taskgroup=4 SMP MESH | |
| H2O_256mol taskgroup | 26.47 | 16.37 | 0.404 | <-taskgroup=4 DUAL MESH | |
| H2O_384mol taskgroup | |||||
| H2O_512mol taskgroup | |||||
| Fenfloss | Datasets | Time (in seconds) | | Efficiency | Comment |
|---|---|---|---|---|---|
| | | 64 cores | 512 cores | 64 to 512 | |
| HLRB II high-density | Cavity_224 | 227.57 | 43.07 | 0.660 | time for 10 iterations |
| | | 64 cores | 512 cores | 64 to 512 | |
| HLRB II high-bandwidth | Cavity_224 | 204.79 | 39.82 | 0.643 | time for 10 iterations |
| | | 64 cores | 512 cores | 64 to 512 | |
| vip ST mode | Cavity_224 | | | | |
| | | 64 cores | 512 cores | 64 to 512 | |
| vip SMT mode | Cavity_224 | | | | |
| | | 64 cores | 512 cores | 64 to 512 | |
| HECToR XT4 | Cavity_224 | 122.29 | 16.22 | 0.942 | time for 10 iterations |
| | | 8 cores | 64 cores | 8 to 64 | |
| HECToR X2 | Cavity_224 | 296.09 | 33.56 | 1.103 | time for 10 iterations |
| | | 64 cores | 512 cores | 64 to 512 | |
| Babel | Cavity_224 | 288.64 | 33.93 | 1.063 | time for 10 iterations |
| GADGET | Datasets | Time (in seconds) | Efficiency | Comment | |
|---|---|---|---|---|---|
| 512 cores | 2048 cores | 512 to 2048 | |||
| HLRB II high-density |
small, no IO | ||||
| medium, no IO | 30.58 | time for iterations 2 + 3 | |||
| large, no IO | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| HLRB II high-bandwidth |
small, no IO | ||||
| medium, no IO | 29.29 | 17.72 | 0.413 | time for iterations 2 + 3 | |
| large, no IO | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| vip ST mode |
small, no IO | ||||
| medium, no IO | |||||
| large, no IO | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| vip SMT mode |
small, no IO | ||||
| medium, no IO | |||||
| large, no IO | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| HECToR XT4 |
small, no IO | 2.25 | time for iterations 2 + 3 | ||
| medium, no IO | 16.54 | time for iterations 2 + 3 | |||
| large, no IO | 430.10 | 102.86 | 1.045 | time for iterations 2 + 3 | |
| 8 cores | 64 cores | 8 to 64 | |||
| HECToR X2 |
small, no IO | ||||
| medium, no IO | time for iterations 2 + 3 | ||||
| large, no IO | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| Babel | small, no IO | ||||
| medium, no IO | 77.27 | 21.65 | 0.819 | time for iterations 2 + 3 | |
| large, no IO | |||||
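As a cross-check, the efficiencies in the GADGET table above can be recomputed from the two timings, assuming efficiency means speedup divided by the ratio of core counts. The helper names and the 0.001 tolerance in this sketch are illustrative choices. Under that assumption, the Babel "medium, no IO" row comes out at about 0.892 rather than the listed 0.819, so one of the three numbers in that row may have been mistyped.

```python
def parallel_efficiency(t_small, t_large, cores_small, cores_large):
    # Assumed definition: speedup divided by the increase in core count.
    return (t_small / t_large) / (cores_large / cores_small)


# (platform, dataset, time on 512 cores, time on 2048 cores, listed efficiency)
# Only rows that list both timings and an efficiency are included.
gadget_rows = [
    ("HLRB II high-bandwidth", "medium, no IO", 29.29, 17.72, 0.413),
    ("HECToR XT4", "large, no IO", 430.10, 102.86, 1.045),
    ("Babel", "medium, no IO", 77.27, 21.65, 0.819),
]


def find_mismatches(rows, tolerance=0.001):
    """Return rows whose listed efficiency differs from the recomputed one."""
    mismatches = []
    for platform, dataset, t512, t2048, listed in rows:
        recomputed = round(parallel_efficiency(t512, t2048, 512, 2048), 3)
        if abs(recomputed - listed) > tolerance:
            mismatches.append((platform, dataset, listed, recomputed))
    return mismatches


for row in find_mismatches(gadget_rows):
    print(row)
```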
| GENE | Datasets | Time (in seconds) | Efficiency | Comment | |
|---|---|---|---|---|---|
| 512 cores | 2048 cores | 512 to 2048 | |||
| HLRB II high-density |
strong_512 | 8.22 | 3.58 | 0.574 | time per timestep |
| 512 cores | 2048 cores | 512 to 2048 | |||
| HLRB II high-bandwidth |
strong_512 | 6.98 | time per timestep | ||
| 512 cores | 2048 cores | 512 to 2048 | |||
| vip ST mode |
strong_512 | ||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| vip SMT mode |
strong_512 | ||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| HECToR XT4 |
strong_512 | time per timestep | |||
| 8 cores | 64 cores | 8 to 64 | |||
| HECToR X2 |
strong_512 | time per timestep | |||
| 512 cores | 2048 cores | 512 to 2048 | |||
| Babel | strong_512 | 10.30 | time per timestep | ||
| IFS | Datasets | Time (in seconds) | Efficiency | Comment | |
|---|---|---|---|---|---|
| 512 cores | 2048 cores | 512 to 2048 | |||
| HLRB II high-density |
T159 | ||||
| T799 | |||||
| T1279 | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| HLRB II high-bandwidth |
T159 | ||||
| T799 | |||||
| T1279 | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| vip ST mode |
T159 | ||||
| T799 | |||||
| T1279 | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| vip SMT mode |
T159 | ||||
| T799 | |||||
| T1279 | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| HECToR XT4 |
T159 | ||||
| T799 | |||||
| T1279 | |||||
| 8 cores | 64 cores | 8 to 64 | |||
| HECToR X2 |
T159 | ||||
| T799 | |||||
| T1279 | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| Babel | T159 | 2.33 | 1.22 | 0.477 | SMP MESH |
| T799 | 62.99 | 18.01 | 0.874 | SMP TORUS(4096) rest MESH (other timing:7.70 on 8192 cores) | |
| T1279 | 65.16 | SMP TORUS(4096) rest MESH (other timing:22.60 on 8192 cores and 21.95 on 16384 cores) | |||
| IQCS | Datasets | Time (in seconds) | Efficiency | Comment | |
|---|---|---|---|---|---|
| 256 cores | 1024 cores | 256 to 1024 | |||
| HLRB II high-density |
28 Qubits | ||||
| 29 Qubits | |||||
| 30 Qubits | |||||
| 31 Qubits | |||||
| 32 Qubits/1 | |||||
| 32 Qubits/2 | |||||
| 32 Qubits/3 | 31.93 | total wallclock time | |||
| 33 Qubits | |||||
| 34 Qubits | |||||
| 256 cores | 1024 cores | 256 to 1024 | |||
| HLRB II high-bandwidth |
28 Qubits | ||||
| 29 Qubits | |||||
| 30 Qubits | |||||
| 31 Qubits | |||||
| 32 Qubits/1 | |||||
| 32 Qubits/2 | |||||
| 32 Qubits/3 | 19.31 | total wallclock time | |||
| 33 Qubits | |||||
| 34 Qubits | |||||
| 256 cores | 1024 cores | 256 to 1024 | |||
| vip ST mode |
28 Qubits | ||||
| 29 Qubits | |||||
| 30 Qubits | |||||
| 31 Qubits | |||||
| 32 Qubits/1 | |||||
| 32 Qubits/2 | |||||
| 32 Qubits/3 | |||||
| 33 Qubits | |||||
| 34 Qubits | |||||
| 256 cores | 1024 cores | 256 to 1024 | |||
| vip SMT mode |
28 Qubits | ||||
| 29 Qubits | |||||
| 30 Qubits | |||||
| 31 Qubits | |||||
| 32 Qubits/1 | |||||
| 32 Qubits/2 | |||||
| 32 Qubits/3 | |||||
| 33 Qubits | |||||
| 34 Qubits | |||||
| 256 cores | 1024 cores | 256 to 1024 | |||
| HECToR XT4 |
28 Qubits | ||||
| 29 Qubits | |||||
| 30 Qubits | |||||
| 31 Qubits | |||||
| 32 Qubits/1 | |||||
| 32 Qubits/2 | |||||
| 32 Qubits/3 | |||||
| 33 Qubits | |||||
| 34 Qubits | |||||
| 8 cores | 64 cores | 8 to 64 | |||
| HECToR X2 |
28 Qubits | 18.73 | 2.70 | 0.867 | total wallclock time |
| 29 Qubits | 37.51 | 5.30 | 0.885 | ||
| 30 Qubits | 75.46 | 10.51 | 0.897 | ||
| 31 Qubits | 150.87 | 21.00 | 0.898 | ||
| 32 Qubits/1 | |||||
| 32 Qubits/2 | |||||
| 32 Qubits/3 | |||||
| 33 Qubits | |||||
| 34 Qubits | |||||
| 256 cores | 1024 cores | 256 to 1024 | |||
| Babel | 28 Qubits | ||||
| 29 Qubits | |||||
| 30 Qubits | |||||
| 31 Qubits | |||||
| 32 Qubits/1 | 69.68 | 28.98 | 0.601 | VN MESH (256-512-1024) TORUS (2048-4096) | |
| 32 Qubits/2 | 52.43 | 16.01 | 0.819 | DUAL MESH (128-256-512) TORUS (1024-2048) | |
| 32 Qubits/3 | 38.91 | 11.89 | 0.818 | SMP MESH (64-128-256) TORUS (512-1024) | |
| 33 Qubits | 78.60 | 24.09 | 0.816 | SMP MESH (128-256) TORUS (512-1024) | |
| 34 Qubits | 159.39 | 48.38 | 0.824 | SMP MESH (256) TORUS (512-1024) | |
| NAMD | Datasets | Time (in seconds) | Efficiency | Comment | |
|---|---|---|---|---|---|
| 64 cores | 256 cores | 64 to 256 | |||
| HLRB II high-density |
APOA1 | 23.01 | 16.57 | 0.347 | total wallclock time |
| 1CQY | 1342.25 | 1139.04 | 0.295 | ||
| 1GND | 2574.28 | 1623.71 | 0.396 | ||
| 4f2hc | 7998.63 | 3167.11 | 0.631 | ||
| 64 cores | 256 cores | 64 to 256 | |||
| HLRB II high-bandwidth |
APOA1 | 26.46 | 17.01 | 0.389 | total wallclock time |
| 1CQY | 1390.06 | 1195.13 | 0.291 | ||
| 1GND | 2634.59 | 1679.09 | 0.392 | ||
| 4f2hc | 7746.23 | 3036.08 | 0.638 | ||
| 64 cores | 256 cores | 64 to 256 | |||
| vip ST mode |
APOA1 | ||||
| 1CQY | |||||
| 1GND | |||||
| 4f2hc | |||||
| 64 cores | 256 cores | 64 to 256 | |||
| vip SMT mode |
APOA1 | ||||
| 1CQY | |||||
| 1GND | |||||
| 4f2hc | |||||
| 64 cores | 256 cores | 64 to 256 | |||
| HECToR XT4 |
APOA1 | ||||
| 1CQY | |||||
| 1GND | |||||
| 4f2hc | |||||
| 8 cores | 64 cores | 8 to 64 | |||
| HECToR X2 |
APOA1 | ||||
| 1CQY | |||||
| 1GND | |||||
| 4f2hc | |||||
| 64 cores | 256 cores | 64 to 256 | |||
| Babel | APOA1 | 109.43 | 48.09 | 0.569 | VN, MESH |
| 1CQY | 4214.39 | 2176.96 | 0.484 | DUAL,MESH | |
| 1GND | 8373.47 | 3493.88 | 0.599 | DUAL,MESH | |
| 4f2hc | 14059.56 | DUAL,MESH | |||
| NEMO | Datasets | Time (in seconds) | Efficiency | Comment | |
|---|---|---|---|---|---|
| 512 cores | 1024 cores | 512 to 1024 | |||
| HLRB II high-density |
GYRE.25, no IO | 219.00 | total wallclock time | ||
| GYRE.25, IO | 256.00 | ||||
| GYRE.50, no IO | 482.00 | 752.00 | 0.320 | ||
| GYRE.50, IO | 640.00 | 960.00 | 0.333 | ||
| GYRE.150, no IO | 1963.00 | ||||
| GYRE.150, IO | 2021.00 | ||||
| 512 cores | 1024 cores | 512 to 1024 | |||
| HLRB II high-bandwidth |
GYRE.25, no IO | 212.00 | total wallclock time | ||
| GYRE.25, IO | 355.00 | ||||
| GYRE.50, no IO | 414.00 | 807.00 | 0.257 | ||
| GYRE.50, IO | 1034.00 | 807.00 | 0.641 | ||
| GYRE.150, no IO | 1079.00 | ||||
| GYRE.150, IO | 1715.00 | ||||
| 512 cores | 1024 cores | 512 to 1024 | |||
| vip ST mode |
GYRE.25, no IO | ||||
| GYRE.25, IO | |||||
| GYRE.50, no IO | |||||
| GYRE.50, IO | |||||
| GYRE.150, no IO | |||||
| GYRE.150, IO | |||||
| 512 cores | 1024 cores | 512 to 1024 | |||
| vip SMT mode |
GYRE.25, no IO | ||||
| GYRE.25, IO | |||||
| GYRE.50, no IO | |||||
| GYRE.50, IO | |||||
| GYRE.150, no IO | |||||
| GYRE.150, IO | |||||
| 512 cores | 1024 cores | 512 to 1024 | |||
| HECToR XT4 |
GYRE.25, no IO | ||||
| GYRE.25, IO | |||||
| GYRE.50, no IO | |||||
| GYRE.50, IO | |||||
| GYRE.150, no IO | |||||
| GYRE.150, IO | |||||
| 8 cores | 64 cores | 8 to 64 | |||
| HECToR X2 |
GYRE.25, no IO | ||||
| GYRE.25, IO | |||||
| GYRE.50, no IO | |||||
| GYRE.50, IO | |||||
| GYRE.150, no IO | |||||
| GYRE.150, IO | |||||
| 512 cores | 1024 cores | 512 to 1024 | |||
| Babel | GYRE.25, no IO | 445.00 | 401.00 | 0.555 | VN |
| GYRE.25, IO | 738.00 | 561.00 | 0.658 | VN | |
| GYRE.50, no IO | 1625.00 | 1035.00 | 0.785 | VN (256, 512, 1024, 2048), DUAL (128) | |
| GYRE.50, IO | 2290.00 | 1571.00 | 0.729 | VN (256, 512, 1024, 2048), DUAL (128) | |
| GYRE.150, no IO | 4348.00 | 2319.00 | 0.937 | SMP (512), DUAL (1024, 2048), VN(4096) | |
| GYRE.150, IO | 4592.00 | 2577.00 | 0.891 | SMP (512), DUAL (1024, 2048), VN(4096) | |
| PEPC | Datasets | Time (in seconds) | | Efficiency | Comment |
|---|---|---|---|---|---|
| | | 64 cores | 512 cores | 64 to 512 | |
| HLRB II high-density | Sphere 5M | | | | |
| | Sphere 15M | | | | |
| | Sphere 25M | | | | |
| | | 64 cores | 512 cores | 64 to 512 | |
| HLRB II high-bandwidth | Sphere 5M | | | | |
| | Sphere 15M | | | | |
| | Sphere 25M | | | | |
| | | 64 cores | 512 cores | 64 to 512 | |
| vip ST mode | Sphere 5M | | | | total time |
| | Sphere 15M | | | | |
| | Sphere 25M | | | | |
| | | 64 cores | 512 cores | 64 to 512 | |
| vip SMT mode | Sphere 5M | | | | total time |
| | Sphere 15M | | | | |
| | Sphere 25M | | | | |
| | | 64 cores | 512 cores | 64 to 512 | |
| HECToR XT4 | Sphere 5M | | | | |
| | Sphere 15M | | | | |
| | Sphere 25M | | | | |
| | | 8 cores | 64 cores | 8 to 64 | |
| HECToR X2 | Sphere 5M | | | | |
| | Sphere 15M | | | | |
| | Sphere 25M | | | | |
| | | 64 cores | 512 cores | 64 to 512 | |
| Babel | Sphere 5M | | | | |
| | Sphere 15M | | | | |
| | Sphere 25M | | | | |
| QuantumESPRESSO | Datasets | Time (in seconds) | Efficiency | Comment | |
|---|---|---|---|---|---|
| 64 cores | 256 cores | 64 to 256 | |||
| HLRB II high-density |
AUSURF112 | 488.38 | 321.07 | 0.380 | total wallclock time |
| AUSURF112 taskgroup (2 and 4 task group) |
484.28 | 303.73 | 0.399 | ||
| PSIWAT (2 and 4 task group) |
1886.32 | 840.29 | 0.561 | ||
| 64 cores | 256 cores | 64 to 256 | |||
| HLRB II high-bandwidth |
AUSURF112 | 492.81 | 314.92 | 0.391 | total wallclock time |
| AUSURF112 taskgroup (2 and 4 task group) |
478.43 | 288.63 | 0.414 | ||
| PSIWAT (2 and 4 task group) |
1794 | 836.07 | 0.536 | ||
| 64 cores | 256 cores | 64 to 256 | |||
| vip ST mode |
AUSURF112 | ||||
| AUSURF112 taskgroup (2 and 4 task group) |
|||||
| PSIWAT (2 and 4 task group) |
|||||
| 64 cores | 256 cores | 64 to 256 | |||
| vip SMT mode |
AUSURF112 | ||||
| AUSURF112 taskgroup (2 and 4 task group) |
|||||
| PSIWAT (2 and 4 task group) |
|||||
| 64 cores | 256 cores | 64 to 256 | |||
| HECToR XT4 |
AUSURF112 | 599.39 | 304.38 | 0.492 | |
| AUSURF112 taskgroup (2 and 4 task group) |
605.3 | 256.28 | 0.590 | ||
| PSIWAT (2 and 4 task group) |
2094.34 | 771.64 | 0.679 | ||
| 8 cores | 64 cores | 8 to 64 | |||
| HECToR X2 |
AUSURF112 | 2028.12 | 428.72 | 0.591 | total wallclock time |
| AUSURF112 taskgroup (2 and 4 task group) |
428.58 | ||||
| PSIWAT (2 and 4 task group) |
1045.29 | ||||
| 64 cores | 256 cores | 64 to 256 | |||
| Babel | AUSURF112 | 599.44 | VN, PREFER_TORUS, XYZT | ||
| AUSURF112 taskgroup (2 and 4 task group) |
644.83 | VN, PREFER_TORUS, XYZT | |||
| PSIWAT (2 and 4 task group) |
1827.19 | DUAL (256), VN (512), PREFER_TORUS, XYZT | |||
| RAMSES | Datasets | Time (in seconds) | Efficiency | Comment | |
|---|---|---|---|---|---|
| 512 cores | 2048 cores | 512 to 2048 | |||
| HLRB II high-density |
Sedov3D | 62.00 | total wallclock time | ||
| AMR | 550.00 | 687.00 | 0.200 | ||
| Sedov3D 1024 | 440.00 | 115.00 | 0.957 | ||
| 512 cores | 2048 cores | 512 to 2048 | |||
| HLRB II high-bandwidth |
Sedov3D | 41.00 | total wallclock time | ||
| AMR | 495.00 | 666.00 | 0.186 | ||
| Sedov3D 1024 | 271.00 | 79.00 | 0.858 | ||
| 512 cores | 2048 cores | 512 to 2048 | |||
| vip ST mode |
Sedov3D | ||||
| AMR | |||||
| Sedov3D 1024 | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| vip SMT mode |
Sedov3D | ||||
| AMR | |||||
| Sedov3D 1024 | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| HECToR XT4 |
Sedov3D | ||||
| AMR | |||||
| Sedov3D 1024 | |||||
| 8 cores | 64 cores | 8 to 64 | |||
| HECToR X2 |
Sedov3D | ||||
| AMR | |||||
| Sedov3D 1024 | |||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| Babel | Sedov3D | 85.60 | 21.80 | 0.982 | VN, TXYZ, MESH(64,128,256,512,1024), TORUS(2048,4096) |
| AMR | 1349.00 | 960.00 | 0.351 | SMP(64), DUAL(128,256), VN(512, 1024,2048, 4096), TXYZ, MESH(64, 128,256,512,1024), TORUS(2048,4096) | |
| Sedov3D 1024 | VN, TXYZ, TORUS (other timings : 1692 on 8192 cores, 848 on 16384 cores) | ||||
| SU3_AHiggs | Datasets | Time (in seconds) | Efficiency | Comment | |
|---|---|---|---|---|---|
| 512 cores | 2048 cores | 512 to 2048 | |||
| HLRB II high-density |
32^3 lattice, 10000 iterations |
36.00 | total wallclock time | ||
| 256^3 lattice, 100 iterations |
184.50 | ||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| HLRB II high-bandwidth |
32^3 lattice, 10000 iterations |
36.30 | 22.50 | 0.403 | total wallclock time |
| 256^3 lattice, 100 iterations |
125.50 | 26.70 | 1.175 | ||
| 512 cores | 2048 cores | 512 to 2048 | |||
| vip ST mode |
32^3 lattice, 10000 iterations |
||||
| 256^3 lattice, 100 iterations |
|||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| vip SMT mode |
32^3 lattice, 10000 iterations |
||||
| 256^3 lattice, 100 iterations |
|||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| HECToR XT4 |
32^3 lattice, 10000 iterations |
54.30 | |||
| 256^3 lattice, 100 iterations |
78.40 | 19.20 | 1.021 | ||
| 8 cores | 64 cores | 8 to 64 | |||
| HECToR X2 |
32^3 lattice, 10000 iterations |
||||
| 256^3 lattice, 100 iterations |
|||||
| 512 cores | 2048 cores | 512 to 2048 | |||
| Babel | 32^3 lattice, 10000 iterations |
154.90 | VN, PREFER_TORUS, TXYZ | ||
| 256^3 lattice, 100 iterations |
242.60 | 107.80 | 0.563 | VN, PREFER_TORUS, TXYZ | |
Compiler info and libraries
| BQCD | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | ifort 9.1 + icc 9.1 | -O2 -openmp | |
| vip | mpxlf90_r, mpcc_r | -qsmp -qnosave -qsuffix=f=f90 -q64 (for mpcc_r: -qsmp -qnosave) [add "-qsmp=omp" for hybrid BQCD] | |
| HECToR XT4 | | | |
| HECToR X2 | | | |
| Babel | mpixlf90_r, mpixlc_r | -qarch=450 -qtune=450 -O3 -qstrict | |
| CPMD | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | ifort 9.1 | -O3 | mkl |
| vip | mpxlf_r | -q64 -qtune=pwr6 -qarch=pwr6 -O3 -qrealsize=8 -qsuffix=f=f90 | |
| HECToR XT4 | | | |
| HECToR X2 | | | |
| Babel | bgxlf_r | -O3 -w -qsmp=omp -qnosave -qarch=450 -c | lapack, essl |
| Fenfloss | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | ifort 9.1 | -O3 -ipo | |
| vip | mpxlf90_r | -q64 -O5 | |
| HECToR XT4 | pgf90/7.1.4 | -fastsse -Msmart -Mipa=fast | |
| HECToR X2 | ftn (x1x2-pe/6.0.0.1) | -O3 | |
| Babel | mpixlf90_r | -O5 | |
| GADGET | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | icc 9.1 | -O2 | fftw2, gsl |
| vip | mpcc_r | -q64 -qtune=pwr6 -qarch=pwr6 -O3 -qstrict -qcpluscmt -qipa | fftw2, gsl |
| HECToR XT4 | pgcc | -fastsse -O3 -Mipa=fast,inline | fftw2, gsl, hdf5 |
| HECToR X2 | | | |
| Babel | mpixlc_r | -O3 -qstrict -qarch=450 -qtune=450 -qcpluscmt | fftw2, gsl, hdf5 |
| GENE | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | ifort 9.1 | -O3 -ip -ftz -align -fno-alias -r8 | blas, fftw3 |
| vip | mpxlf90_r | -q64 -qrealsize=8 -qtune=pwr6 -qarch=pwr6 -O3 -qsuffix=cpp=F90 -WF,-DWITHESSL,-DDOUBLE_PREC | essl |
| HECToR XT4 | | | |
| HECToR X2 | | | |
| Babel | bgxlf90_r | -qtune=450 -qarch=450d | -lesslbg |
| IFS | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | | | |
| vip | mpxlf90_r | -qextname -q64 -qarch=pwr6 -O3 -qstrict -qautodbl=dbl4 -qfree=F90 -qsuffix=cpp=F90 -qsmp=omp -qsource -NS32648 -WF,-DRS6K -WF,-DBLAS -WF,POWER6 | essl, mass, massvp6 |
| | mpcc_r | -O3 -qarch=pwr6 -q64 -qmaxmem=-1 -DRS6K -DINTERCEPT_ALLOC -D_ABI64 -DFORTRAN_WITH_UNDERSCORE | |
| HECToR XT4 | | | |
| HECToR X2 | | | |
| Babel | mpixlf90_r | -O2 -qstrict -qarch=450 -qmaxmem=-1 -qextname -qsource -qautodbl=dbl4 -qfree=F90 -qsmp=omp -qsuffix=cpp=F90 -WF,-DLINUX -WF,-DBLAS -WF,-DBLUEGENE | esslbg, mass, massv, fmpich_.cnk |
| | mpixlc_r | -O2 -qarch=450 -qmaxmem=-1 -DLINUX -DBLUEGENE -DINTERCEPT_ALLOC -DFORTRAN_WITH_UNDERSCORE | |
| IQCS | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | | | |
| vip | mpxlf90_r | -O5 -q64 -qtune=pwr6 -qarch=pwr6 -qxlf90=autodealloc | |
| HECToR XT4 | pgf90/7.1.4 | -O3 -DDYNALLOC | |
| HECToR X2 | ftn | -f free -N 255 -h cpu=cray-x2 -O3 -e Z -DDYNALLOC | |
| Babel | mpixlf90_r | -O5 -qnostrict -qarch=450 -qtune=450 -WF,-DDYNALLOC | |
| NAMD | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | mpicc + icc | for namd: CXX = icpc -D_IA64 -I/usr/local/gnu/include CXXOPTS = -static-libcxa -O2 | fftw2, tcl |
| | | for charm++: ./build charm++ mpi-linux-ia64 -nobs -DCMK_OPTIMIZE=1 | |
| vip | xlc_r | for namd: -O4 -qinlglue -qarch=pwr6 -qtune=pwr6 | |
| | mpCC_r | for charm: -O3 -qstrict -Q ./build charm++ mpi-sp | |
| HECToR XT4 | | | |
| HECToR X2 | | | |
| Babel | bgxlC_r | for namd: -O3 -qhot -qarch=450d -qtune=450 | fftw |
| | | for charm: ./build charm++ mpi-bluegenep xlc --no-shared -j4 -O3 -qarch=450d -DCMK_OPTIMIZE=1 | |
| NEMO | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | ifort 9.1 | -O2 -r8 | netcdf |
| vip | mpxlf90_r | -q64 -qtune=pwr6 -qarch=pwr6 -O3 -qrealsize=8 -qsave -qsuffix=cpp=F90 | netcdf |
| HECToR XT4 | | | |
| HECToR X2 | | | |
| Babel | mpixlf90_r | -O3 -qstrict -qtune=450 -qarch=450 | netcdf |
| PEPC | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | | | |
| vip | mpxlf90_r | -qtune=pwr6 -qarch=pwr6 -q64 -O4 -qipa=level=1 -qipa=inline=key2addr | |
| HECToR XT4 | | | |
| HECToR X2 | | | |
| Babel | | | |
| QuantumESPRESSO | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | intel/10.1 | -O2 -assume byterecl (link flags: -i-static -openmp) | mkl/9.1 |
| vip | mpxlf90_r | -q64 -O3 -qarch=pwr6 -qtune=pwr6 | essl, mass |
| HECToR XT4 | pgi/8.0.2 | -fast -r8 | acml/4.0.1a |
| HECToR X2 | ftn | -O3 | libsci/6.0.0.3.chip21 |
| Babel | mpixlf90_r | -O4 -qarch=450d -qtune=450 -qdpc -qalias=nointptr -Q | essl |
| RAMSES | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | ifort 9.1 | -O3 -ipo | |
| vip | mpxlf90_r | -O4 -qnostrict -qhot -qarch=pwr6 -qtune=pwr6 | |
| HECToR XT4 | | | |
| HECToR X2 | | | |
| Babel | mpixlf90_r | -O4 -qstrict -qhot -qtune=450 -qarch=450d | |
| SU3_AHiggs | Compiler | Compiler options | Libraries |
|---|---|---|---|
| HLRB II | intel/9.1 | -O3 -ip | |
| vip | mpcc_r | -q64 -O5 | |
| HECToR XT4 | pathscale/3.1 | -Ofast -march=barcelona | |
| HECToR X2 | cc | -O3 | |
| Babel | mpixlc_r | -O3 -qarch=450d -qtune=450 -qipa=level=2 | mass |


