
Results

This page contains results obtained by the DEISA benchmarking team on test platforms (see "Platforms used" for details).
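
The tables on this page do not define the Efficiency column. The published values are consistent with the usual strong-scaling definition, the measured speedup divided by the ideal speedup: E = (T_small * N_small) / (T_large * N_large), where T is the run time and N the core count. The following is a minimal sketch under that assumption (the definition is inferred from the numbers, not stated by the benchmarking team, and the function name `parallel_efficiency` is hypothetical). It reproduces two BQCD entries from the tables.

```python
def parallel_efficiency(t_small, t_large, n_small, n_large):
    """Strong-scaling efficiency: measured speedup divided by ideal speedup.

    Assumed definition, inferred from the published values.
    """
    speedup = t_small / t_large   # how much faster the larger run was
    ideal = n_large / n_small     # speedup expected from perfect scaling
    return speedup / ideal


# BQCD, SGI Altix HLRB2 high-density, 24*24*24*48, 50 iterations, 512 to 2048 cores
print(round(parallel_efficiency(23.93, 8.36, 512, 2048), 3))   # 0.716

# BQCD, Cray XE6 HECToR, same dataset, 528 to 2064 cores
print(round(parallel_efficiency(27.11, 7.05, 528, 2064), 3))   # 0.984
```

The second call shows why the core counts have to enter the formula explicitly: on platforms where the counts are not an exact factor of 4 apart (528 to 2064), dividing the speedup by 4 would not match the published value.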


BQCD Datasets Time (in seconds) Efficiency Comment


512 cores 2048 cores 512 to 2048
SGI Altix HLRB2
high-density
24*24*24*48
50 iterations
23.93 8.36 0.716
48*48*48*96
50 iterations
307.92 102.41 0.752
48*48*48*96
100 iterations






512 cores 2048 cores 512 to 2048
SGI Altix HLRB2
high-bandwidth
24*24*24*48
50 iterations
13.23 6.85 0.483
48*48*48*96
50 iterations
232.27 60.17 0.965
48*48*48*96
100 iterations






512 cores 2048 cores 512 to 2048
IBM P6 vip
ST mode
24*24*24*48
50 iterations
13.37 6.67 0.501
48*48*48*96
50 iterations
235.52 51.55 1.142
48*48*48*96
100 iterations
465.31 211.65 0.550


512 cores 2048 cores 512 to 2048
IBM P6 vip
SMT mode
24*24*24*48
50 iterations
8.49 3.81 0.557
48*48*48*96
50 iterations
136.07 27.29 1.247
48*48*48*96
100 iterations
270.06 62.88 1.074


512 cores 2048 cores 512 to 2048
Cray XT4 HECToR 24*24*24*48
50 iterations
29.41 8.70 0.845
48*48*48*96
50 iterations
438.75 120.76 0.908
48*48*48*96
100 iterations






16 cores 64 cores 16 to 64
Cray X2 HECToR 24*24*24*48
50 iterations




48*48*48*96
50 iterations




48*48*48*96
100 iterations






512 cores 2048 cores 512 to 2048
IBM BGP Babel 24*24*24*48
50 iterations
39.61 14.21 0.697
48*48*48*96
50 iterations
582.67 161.56 0.902
48*48*48*96
100 iterations
1154.21 319.87 0.902


512 cores 2048 cores 512 to 2048
IBM PowerPC MareNostrum 24*24*24*48
50 iterations
41.02


48*48*48*96
50 iterations
710.35


48*48*48*96
100 iterations






512 cores 2048 cores 512 to 2048
IBM BGP JUGENE 24*24*24*48
50 iterations
37.34 10.26 0.910
48*48*48*96
50 iterations
564.61 140.71 1.003
48*48*48*96
100 iterations
1118.49 278.8 1.003


512 cores 2048 cores 512 to 2048
IBM SP6 CINECA
ST mode
24*24*24*48
50 iterations
6.24 3.36 0.464
48*48*48*96
50 iterations
125.29 27.7 1.131
48*48*48*96
100 iterations






512 cores 2048 cores 512 to 2048
IBM SP6 CINECA
SMT mode
24*24*24*48
50 iterations
9.8


48*48*48*96
50 iterations
281.73


48*48*48*96
100 iterations






512 cores 2048 cores 512 to 2048
Intel Nehalem JuRoPA 24*24*24*48
50 iterations
8.37 4.22 0.496
48*48*48*96
50 iterations
174.34 39.95 1.091
48*48*48*96
100 iterations






16 cores 64 cores 16 to 64
NEC SX9 HLRS 24*24*24*48
50 iterations




48*48*48*96
50 iterations




48*48*48*96
100 iterations






512 cores 2048 cores 512 to 2048
Dell PowerEdge Ekman 24*24*24*48
50 iterations
19.76 5.37 0.920 mpi_paffinity_alone=1
48*48*48*96
50 iterations
405.8 99.93 1.015 mpi_paffinity_alone=1
48*48*48*96
100 iterations






516 cores 2052 cores 516 to 2052
Cray XT5 Rosa 24*24*24*48
50 iterations
34.04 23.67 0.362
48*48*48*96
50 iterations
498.75 144.55 0.868
48*48*48*96
100 iterations






528 cores 2064 cores 528 to 2064
Cray XE6 HECToR 24*24*24*48
50 iterations
27.11 7.05 0.984
48*48*48*96
50 iterations
396.69 110.56 0.918
48*48*48*96
100 iterations




 

CPMD Datasets Time (in seconds) Efficiency Comment


64 cores 256 cores 64 to 256
SGI Altix HLRB2
high-density
H2O_128mol



H2O_256mol



H2O_128mol taskgroup



H2O_256mol taskgroup



H2O_384mol taskgroup



H2O_512mol taskgroup





64 cores 256 cores 64 to 256
SGI Altix HLRB2
high-bandwidth
H2O_128mol



H2O_256mol



H2O_128mol taskgroup



H2O_256mol taskgroup



H2O_384mol taskgroup



H2O_512mol taskgroup





64 cores 256 cores 64 to 256
IBM P6 vip
ST mode
H2O_128mol 3.34 2.02 0.413
H2O_256mol 14.61 5.03 0.726
H2O_128mol taskgroup 4.60 1.45 0.793
H2O_256mol taskgroup 19.85 4.94 1.005
H2O_384mol taskgroup 51.82 13.93 0.930
H2O_512mol taskgroup 102.66 27.10 0.947


64 cores 256 cores 64 to 256
IBM P6 vip
SMT mode
H2O_128mol 3.38 4.22 0.200
H2O_256mol 18.87


H2O_128mol taskgroup 4.38 2.01 0.545
H2O_256mol taskgroup 53.82


H2O_384mol taskgroup 53.82


H2O_512mol taskgroup 102.90




64 cores 256 cores 64 to 256
Cray XT4 HECToR H2O_128mol 15.83 6.29 0.629
H2O_256mol 56.07 14.91 0.940
H2O_128mol taskgroup 13.02 7.80 0.417 taskgroup=4
H2O_256mol taskgroup 53.46 18.73 0.714 taskgroup=4
H2O_384mol taskgroup 125.24 41.60 0.753 taskgroup=4
H2O_512mol taskgroup 257.21 87.54 0.735 taskgroup=4


16 cores 64 cores 16 to 64
Cray X2 HECToR H2O_128mol



H2O_256mol



H2O_128mol taskgroup



H2O_256mol taskgroup



H2O_384mol taskgroup



H2O_512mol taskgroup





64 cores 256 cores 64 to 256
IBM BGP Babel H2O_128mol 7.64 4.39 0.435 taskgroup=1, SMP, MESH
H2O_256mol
17.08
taskgroup=1, DUAL, MESH
H2O_128mol taskgroup 6.30 2.80 0.562 taskgroup=4, SMP, MESH
H2O_256mol taskgroup 26.47 16.37 0.404 taskgroup=4, DUAL, MESH
H2O_384mol taskgroup



H2O_512mol taskgroup





64 cores 256 cores 64 to 256
IBM PowerPC MareNostrum H2O_128mol 19.96


H2O_256mol 45.88


H2O_128mol taskgroup 10.02 21.58 0.116
H2O_256mol taskgroup 28.11 39.85 0.176
H2O_384mol taskgroup 67.58 43.52 0.388
H2O_512mol taskgroup 128.84 69.40 0.464


64 cores 256 cores 64 to 256
IBM BGP JUGENE H2O_128mol 21.67 8.52 0.636
H2O_256mol 87.66 22.87 0.958
H2O_128mol taskgroup 17.32 6.99 0.619
H2O_256mol taskgroup 76.43 25.86 0.739
H2O_384mol taskgroup
65.37

H2O_512mol taskgroup





64 cores 256 cores 64 to 256
IBM SP6 CINECA
ST mode
H2O_128mol
18.57

H2O_256mol



H2O_128mol taskgroup
59.97

H2O_256mol taskgroup



H2O_384mol taskgroup



H2O_512mol taskgroup





64 cores 256 cores 64 to 256
IBM SP6 CINECA
SMT mode
H2O_128mol



H2O_256mol



H2O_128mol taskgroup



H2O_256mol taskgroup



H2O_384mol taskgroup



H2O_512mol taskgroup





64 cores 256 cores 64 to 256
Intel Nehalem JuRoPA H2O_128mol 7.21 16.91 0.107
H2O_256mol 25.94 34.85 0.186
H2O_128mol taskgroup 5.96 2.92 0.510
H2O_256mol taskgroup 25.60 8.62 0.742
H2O_384mol taskgroup 64.49 20.57 0.784
H2O_512mol taskgroup 123.61 38.63 0.800


16 cores 64 cores 16 to 64
NEC SX9 HLRS H2O_128mol



H2O_256mol



H2O_128mol taskgroup



H2O_256mol taskgroup



H2O_384mol taskgroup



H2O_512mol taskgroup





64 cores 256 cores 64 to 256
Dell PowerEdge Ekman H2O_128mol



H2O_256mol



H2O_128mol taskgroup



H2O_256mol taskgroup



H2O_384mol taskgroup



H2O_512mol taskgroup





72 cores 264 cores 72 to 264
Cray XT5 Rosa H2O_128mol 20.93 7.91 0.722
H2O_256mol 62.84 21.64 0.792
H2O_128mol taskgroup 14.52 8.44 0.469
H2O_256mol taskgroup 58.69 22.24 0.720
H2O_384mol taskgroup 128.76 62.80 0.559
H2O_512mol taskgroup 210.81 110.19 0.522


72 cores 264 cores 72 to 264
Cray XE6 HECToR H2O_128mol 12.03 6.31 0.520
H2O_256mol 45.47 13.40 0.925
H2O_128mol taskgroup 10.79 4.51 0.652
H2O_256mol taskgroup 44.90 13.92 0.880
H2O_384mol taskgroup 111.22 34.05 0.891
H2O_512mol taskgroup 209.73 68.21 0.839

 

Fenfloss

Platform | Cores | Datasets | Time (in seconds), fewer cores | Time (in seconds), more cores | Efficiency | Comment
--- | --- | --- | --- | --- | --- | ---
SGI Altix HLRB2, high-density | 64 to 512 | Cavity_224 | 227.57 | 43.07 | 0.660 |
SGI Altix HLRB2, high-bandwidth | 64 to 512 | Cavity_224 | 204.79 | 39.82 | 0.643 |
IBM P6 vip, ST mode | 64 to 512 | Cavity_224 | 50.48 | 11.84 | 0.533 |
IBM P6 vip, SMT mode | 64 to 512 | Cavity_224 | 51.56 | 52.43 | 0.123 |
Cray XT4 HECToR | 64 to 512 | Cavity_224 | 122.29 | 16.22 | 0.942 |
Cray X2 HECToR | 16 to 64 | Cavity_224 | 123.96 | 33.56 | 0.923 |
IBM BGP Babel | 64 to 512 | Cavity_224 | | | |
IBM PowerPC MareNostrum | 64 to 512 | Cavity_224 | 183.90 | 36.43 | 0.631 |
IBM BGP JUGENE | 64 to 512 | Cavity_224 | 310.33 | 49.12 | 0.790 | XYZT, SMP (64 cores), DUAL (128 cores), VN (other)
IBM SP6 CINECA, ST mode | 64 to 512 | Cavity_224 | 50.58 | 7.26 | 0.871 |
IBM SP6 CINECA, SMT mode | 64 to 512 | Cavity_224 | 36.14 | 7.83 | 0.577 |
Intel Nehalem JuRoPA | 64 to 512 | Cavity_224 | 51.17 | 7.53 | 0.849 |
NEC SX9 HLRS | 16 to 64 | Cavity_224 | 31.32 | 10.38 | 0.754 |
Dell PowerEdge Ekman | 64 to 512 | Cavity_224 | | | |
Cray XT5 Rosa | 72 to 516 | Cavity_224 | | | |
Cray XE6 HECToR | 72 to 528 | Cavity_224 | | | |

 

GADGET Datasets Time (in seconds) Efficiency Comment


512 cores 1024 cores 512 to 1024
SGI Altix HLRB2
high-density
small, no IO



medium, no IO 30.58 18.64 0.820
large, no IO





512 cores 1024 cores 512 to 1024
SGI Altix HLRB2
high-bandwidth
small, no IO



medium, no IO 29.29 17.11 0.856
large, no IO





512 cores 1024 cores 512 to 1024
IBM P6 vip
ST mode
small, no IO 1.61


medium, no IO 18.83 9.42 0.999
large, no IO
215.28



512 cores 1024 cores 512 to 1024
IBM P6 vip
SMT mode
small, no IO



medium, no IO



large, no IO





512 cores 1024 cores 512 to 1024
Cray XT4 HECToR small, no IO 2.25


medium, no IO 16.54 9.16 0.903
large, no IO 430.10 161.97 1.328


16 cores 64 cores 16 to 64
Cray X2 HECToR small, no IO



medium, no IO



large, no IO





512 cores 1024 cores 512 to 1024
IBM BGP Babel small, no IO



medium, no IO 77.27 37.64 1.026
large, no IO





512 cores 1024 cores 512 to 1024
IBM PowerPC MareNostrum small, no IO 8.70


medium, no IO 52.74

2 tasks per node (64 cores), 4 tasks per node (other)
large, no IO





512 cores 1024 cores 512 to 1024
IBM BGP JUGENE small, no IO



medium, no IO 80.26 39.44 1.017 VN mode
large, no IO





512 cores 1024 cores 512 to 1024
IBM SP6 CINECA
ST mode
small, no IO 1.78


medium, no IO 20.71 10.55 0.982
large, no IO 476.12 232.97 1.022


512 cores 1024 cores 512 to 1024
IBM SP6 CINECA
SMT mode
small, no IO 2.01


medium, no IO 13.62 12.05 0.565
large, no IO 262.67 137.59 0.955


512 cores 1024 cores 512 to 1024
Intel Nehalem JuRoPA small, no IO 1.26


medium, no IO 9.65 5.56 0.868
large, no IO 179.62 87.75 1.023


16 cores 64 cores 16 to 64
NEC SX9 HLRS small, no IO



medium, no IO



large, no IO





512 cores 1024 cores 512 to 1024
Dell PowerEdge Ekman small, no IO 3.85


medium, no IO 19.20 9.95 0.965
large, no IO





516 cores 1032 cores 516 to 1032
Cray XT5 Rosa small, no IO



medium, no IO 19.62 12.39 0.792
large, no IO





528 cores 1032 cores 528 to 1032
Cray XE6 HECToR small, no IO 4.74


medium, no IO 19.76 19.66 0.514
large, no IO



 

GENE

Platform | Cores | Datasets | Time (in seconds), fewer cores | Time (in seconds), more cores | Efficiency | Comment
--- | --- | --- | --- | --- | --- | ---
SGI Altix HLRB2, high-density | 512 to 1024 | strong_512 | 8.22 | 4.36 | 0.943 |
SGI Altix HLRB2, high-bandwidth | 512 to 1024 | strong_512 | 6.98 | 5.05 | 0.691 |
IBM P6 vip, ST mode | 512 to 1024 | strong_512 | 2.77 | 1.41 | 0.982 |
IBM P6 vip, SMT mode | 512 to 1024 | strong_512 | 2.47 | 1.3 | 0.950 |
Cray XT4 HECToR | 512 to 1024 | strong_512 | 4.977 | 2.534 | 0.982 |
Cray X2 HECToR | 16 to 64 | strong_512 | | | |
IBM BGP Babel | 512 to 1024 | strong_512 | 19.7 | 10.18 | 0.968 |
IBM PowerPC MareNostrum | 512 to 1024 | strong_512 | 10.3 | 5.82 | 0.885 |
IBM BGP JUGENE | 512 to 1024 | strong_512 | 20.05 | 10.14 | 0.989 |
IBM SP6 CINECA, ST mode | 512 to 1024 | strong_512 | 2.96 | 1.67 | 0.886 |
IBM SP6 CINECA, SMT mode | 512 to 1024 | strong_512 | 2.47 | 1.41 | 0.876 |
Intel Nehalem JuRoPA | 512 to 1024 | strong_512 | 3.034 | 1.556 | 0.975 |
NEC SX9 HLRS | 16 to 64 | strong_512 | | | |
Dell PowerEdge Ekman | 512 to 1024 | strong_512 | | | |
Cray XT5 Rosa | 516 to 1032 | strong_512 | 4.565 | 2.469 | 0.924 |
Cray XE6 HECToR | 528 to 1032 | strong_512 | 4.536 | 2.33 | 0.996 |

 

IFS Datasets Time (in seconds) Efficiency Comment


512 cores 2048 cores 512 to 2048
SGI Altix HLRB2
high-density
T159



T799



T1279





512 cores 2048 cores 512 to 2048
SGI Altix HLRB2
high-bandwidth
T159



T799



T1279





512 cores 2048 cores 512 to 2048
IBM P6 vip
ST mode
T159 0.48

2 way OpenMP
T799 12.51 3.44 0.909 2 way OpenMP
T1279 47.10 12.66 0.930 2 way OpenMP


512 cores 2048 cores 512 to 2048
IBM P6 vip
SMT mode
T159 0.36

4 way OpenMP
T799 8.7 2.74 0.794 4 way OpenMP
T1279 32.79 9.47 0.866 4 way OpenMP


512 cores 2048 cores 512 to 2048
Cray XT4 HECToR T159 0.79 0.4 0.494 4 way OpenMP
T799 21.1 6.24 0.845 4 way OpenMP
T1279 81.59 22.31 0.914 4 way OpenMP


16 cores 64 cores 16 to 64
Cray X2 HECToR T159



T799



T1279





512 cores 2048 cores 512 to 2048
IBM BGP Babel T159 2.33 1.22 0.477 SMP, MESH
T799 62.99 18.01 0.874 SMP, TORUS(4096 cores), MESH(rest)
T1279
65.16
SMP, TORUS(4096 cores), MESH(rest)


512 cores 2048 cores 512 to 2048
IBM PowerPC MareNostrum T159 1.63

4 way OpenMP
T799 36.84

4 way OpenMP
T1279 143.28

4 way OpenMP


512 cores 2048 cores 512 to 2048
IBM BGP JUGENE T159 2.28 1.25 0.456 SMP, MESH
T799 63.12 18.01 0.876 SMP, TORUS(4096 cores), MESH(rest)
T1279
65.21
SMP, TORUS(4096 cores), MESH(rest)


512 cores 2048 cores 512 to 2048
IBM SP6 CINECA
ST mode
T159 0.48

2 way OpenMP
T799 12.62 3.6 0.876 2 way OpenMP
T1279 47.83 12.75 0.938 2 way OpenMP


512 cores 2048 cores 512 to 2048
IBM SP6 CINECA
SMT mode
T159 0.36

4 way OpenMP
T799 8.72 2.73 0.799 4 way OpenMP
T1279 33.31 9.73 0.856 4 way OpenMP


512 cores 2048 cores 512 to 2048
Intel Nehalem JuRoPA T159 0.45

4 way OpenMP
T799 9.8 2.71 0.904 4 way OpenMP
T1279 39.17 10.48 0.934 4 way OpenMP


16 cores 64 cores 16 to 64
NEC SX9 HLRS T159



T799



T1279





512 cores 2048 cores 512 to 2048
Dell PowerEdge Ekman T159 0.81

4 way OpenMP
T799 18.75 6.64 0.706 4 way OpenMP
T1279 72.09 19.6 0.920 4 way OpenMP


516 cores 2052 cores 516 to 2052
Cray XT5 Rosa T159 0.89

6 way OpenMP
T799 22.41 6.35 0.887 6 way OpenMP
T1279 85.14 22.92 0.934 6 way OpenMP


528 cores 2064 cores 528 to 2064
Cray XE6 HECToR T159 0.82 0.41 0.512 6 way OpenMP
T799 22.54 6.42 0.898 6 way OpenMP
T1279 88.73 24.93 0.910 6 way OpenMP

 

IQCS Datasets Time (in seconds) Efficiency Comment


256 cores 1024 cores 256 to 1024
SGI Altix HLRB2
high-density
28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 31.93


33 Qubits



34 Qubits





256 cores 1024 cores 256 to 1024
SGI Altix HLRB2
high-bandwidth
28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 19.31


33 Qubits



34 Qubits





256 cores 1024 cores 256 to 1024
IBM P6 vip
ST mode
28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 16.83 5.82 0.723
33 Qubits 32.76 12.53 0.654
34 Qubits 68.68 25.29 0.679


256 cores 1024 cores 256 to 1024
IBM P6 vip
SMT mode
28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 17.55 7.31 0.600
33 Qubits 36.78 12.67 0.726
34 Qubits 74.39 25.96 0.716


256 cores 1024 cores 256 to 1024
Cray XT4 HECToR 28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 35.32 14.88 0.593
33 Qubits 71.57 29.97 0.597
34 Qubits 140.99 56.86 0.620


16 cores 64 cores 16 to 64
Cray X2 HECToR 28 Qubits 9.72 2.70 0.900
29 Qubits 19.39 5.30 0.915
30 Qubits 38.88 10.51 0.925
31 Qubits 77.70 21.00 0.925
32 Qubits/1



32 Qubits/2



32 Qubits/3



33 Qubits



34 Qubits





256 cores 1024 cores 256 to 1024
IBM BGP Babel 28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1 69.68 28.98 0.601 VN, MESH (256 to 1024 cores), TORUS (other)
32 Qubits/2 52.43 16.01 0.819 DUAL, MESH (128 to 512 cores), TORUS (other)
32 Qubits/3 38.91 11.89 0.818 SMP, MESH (64 to 256 cores), TORUS (other)
33 Qubits 78.60 24.09 0.816 SMP, MESH (128 to 256 cores), TORUS (other)
34 Qubits 159.39 48.38 0.824 SMP, MESH (256 cores), TORUS (other)


256 cores 1024 cores 256 to 1024
IBM PowerPC MareNostrum 28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 81.92


33 Qubits 172.83


34 Qubits 317.61




256 cores 1024 cores 256 to 1024
IBM BGP JUGENE 28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1 72.07 29.36 0.614 VN, MESH (256 to 1024 cores), TORUS (other)
32 Qubits/2 52.42 15.82 0.828 DUAL, MESH (128 to 512 cores), TORUS (other)
32 Qubits/3 38.83 11.17 0.869 SMP, MESH (64 to 256 cores), TORUS (other)
33 Qubits 79.00 22.54 0.876 SMP, MESH (128 to 256 cores), TORUS (other)
34 Qubits 159.50 45.49 0.877 SMP, MESH (256 cores), TORUS (other)


256 cores 1024 cores 256 to 1024
IBM SP6 CINECA
ST mode
28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 16.94 6.22 0.681
33 Qubits 33.18 12.66 0.655
34 Qubits 67.58 23.76 0.711


256 cores 1024 cores 256 to 1024
IBM SP6 CINECA
SMT mode
28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 17.93 7.79 0.575
33 Qubits 36.25 12.68 0.715
34 Qubits 71.04 25.36 0.700


256 cores 1024 cores 256 to 1024
Intel Nehalem JuRoPA 28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 15.22 5.91 0.644
33 Qubits 32.09 11.28 0.711
34 Qubits 64.34 23.42 0.687


16 cores 64 cores 16 to 64
NEC SX9 HLRS 28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3



33 Qubits



34 Qubits





256 cores 1024 cores 256 to 1024
Dell PowerEdge Ekman 28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 35.21 14.10 0.624
33 Qubits 70.53 26.30 0.670
34 Qubits 149.65 50.20 0.745


264 cores 1032 cores 264 to 1032
Cray XT5 Rosa 28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 51.55 19.52 0.676
33 Qubits 101.44 39.13 0.663
34 Qubits
69.65



264 cores 1032 cores 264 to 1032
Cray XE6 HECToR 28 Qubits



29 Qubits



30 Qubits



31 Qubits



32 Qubits/1



32 Qubits/2



32 Qubits/3 37.64 21.55 0.447
33 Qubits 75.86 30.37 0.639
34 Qubits
68.55

 

NAMD Datasets Time (in seconds) Efficiency Comment


64 cores 256 cores 64 to 256
SGI Altix HLRB2
high-density
APOA1



1CQY



1GND



4f2hc



NUCLEOSOME



MEMBRANE





64 cores 256 cores 64 to 256
SGI Altix HLRB2
high-bandwidth
APOA1



1CQY



1GND



4f2hc



NUCLEOSOME



MEMBRANE





64 cores 256 cores 64 to 256
IBM P6 vip
ST mode
APOA1 21.13 11.70 0.451
1CQY 849.15


1GND 1562.27 742.19 0.526
4f2hc 10452.62 3381.71 0.773
NUCLEOSOME 86.58 29.96 0.722
MEMBRANE 190.68 121.96 0.391


64 cores 256 cores 64 to 256
IBM P6 vip
SMT mode
APOA1 19.30 12.22 0.395
1CQY 786.79


1GND 1210.34


4f2hc 8136.58


NUCLEOSOME 58.47 29.47 0.496
MEMBRANE 165.50 104.59 0.396


64 cores 256 cores 64 to 256
Cray XT4 HECToR APOA1
13.02

1CQY
766.36

1GND
1308.75

4f2hc
12470.98

NUCLEOSOME 173.15 68.18824 0.635
MEMBRANE 270.59 141.365784 0.479


16 cores 64 cores 16 to 64
Cray X2 HECToR APOA1



1CQY



1GND



4f2hc



NUCLEOSOME



MEMBRANE





64 cores 256 cores 64 to 256
IBM BGP Babel APOA1 132.32 45.51 0.727 DUAL, MESH
1CQY 3605.33

DUAL, MESH
1GND 6967.56 3125.24 0.557 DUAL, MESH
4f2hc
11068.03
DUAL, MESH
NUCLEOSOME
114.91
DUAL, MESH
MEMBRANE
412.54
DUAL, MESH


64 cores 256 cores 64 to 256
IBM PowerPC MareNostrum APOA1 112.51 36.42 0.772
1CQY



1GND



4f2hc



NUCLEOSOME 351.18 211.89 0.414
MEMBRANE 623.61 372.66 0.418


64 cores 256 cores 64 to 256
IBM BGP JUGENE APOA1 116.36 45.34 0.642 DUAL, MESH
1CQY



1GND



4f2hc



NUCLEOSOME
114.38
DUAL, MESH
MEMBRANE
455.58
DUAL, MESH


64 cores 256 cores 64 to 256
IBM SP6 CINECA
ST mode
APOA1



1CQY



1GND



4f2hc



NUCLEOSOME 77.53 30.02 0.646
MEMBRANE 192.77 118.58 0.406


64 cores 256 cores 64 to 256
IBM SP6 CINECA
SMT mode
APOA1



1CQY



1GND



4f2hc



NUCLEOSOME 87.84 35.86 0.612
MEMBRANE 218.58 124.86 0.438


64 cores 256 cores 64 to 256
Intel Nehalem JuRoPA APOA1



1CQY



1GND



4f2hc



NUCLEOSOME 63.61 49.06 0.324
MEMBRANE 157.91 170.70 0.231


16 cores 64 cores 16 to 64
NEC SX9 HLRS APOA1



1CQY



1GND



4f2hc



NUCLEOSOME



MEMBRANE





64 cores 256 cores 64 to 256
Dell PowerEdge Ekman APOA1



1CQY



1GND



4f2hc



NUCLEOSOME 91.21 35.32 0.646
MEMBRANE 184.78 98.11 0.471


72 cores 264 cores 72 to 264
Cray XT5 Rosa APOA1



1CQY



1GND



4f2hc



NUCLEOSOME 130.60 52.59 0.677
MEMBRANE 264.86 162.07 0.446


72 cores 264 cores 72 to 264
Cray XE6 HECToR APOA1



1CQY



1GND



4f2hc



NUCLEOSOME



MEMBRANE



 

NEMO Datasets Time (in seconds) Efficiency Comment


512 cores 1024 cores 512 to 1024
SGI Altix HLRB2
high-density
GYRE.25, no IO 219.00


GYRE.25, IO 256.00


GYRE.50, no IO 482.00 752.00 0.320
GYRE.50, IO 640.00 960.00 0.333
GYRE.150, no IO
1963.00

GYRE.150, IO
2021.00



512 cores 1024 cores 512 to 1024
SGI Altix HLRB2
high-bandwidth
GYRE.25, no IO 212.00


GYRE.25, IO 355.00


GYRE.50, no IO 414.00 807.00 0.257
GYRE.50, IO 1034.00 807.00 0.641
GYRE.150, no IO
1079.00

GYRE.150, IO
1715.00



512 cores 1024 cores 512 to 1024
IBM P6 vip
ST mode
GYRE.25, no IO 172.00 182.00 0.473
GYRE.25, IO 204.00 192.00 0.531
GYRE.50, no IO 323.00 276.00 0.585 Use of asynchronous send "c_mpi_send = I"
GYRE.50, IO 402.00 391.00 0.514 Use of asynchronous send "c_mpi_send = I"
GYRE.150, no IO
337.00

GYRE.150, IO
519.00



512 cores 1024 cores 512 to 1024
IBM P6 vip
SMT mode
GYRE.25, no IO 261.00 337.00 0.387
GYRE.25, IO 343.00 430.00 0.399
GYRE.50, no IO 380.00 383.00 0.496
GYRE.50, IO 550.00 712.00 0.386
GYRE.150, no IO



GYRE.150, IO





512 cores 1024 cores 512 to 1024
Cray XT4 HECToR GYRE.25, no IO 266.00 340.00 0.391
GYRE.25, IO 317.00 273.00 0.581
GYRE.50, no IO 687.00 475.00 0.723
GYRE.50, IO 891.00 658.00 0.677
GYRE.150, no IO 2430.00 998.00 1.217
GYRE.150, IO 2592.00 1263.00 1.026


16 cores 64 cores 16 to 64
Cray X2 HECToR GYRE.25, no IO



GYRE.25, IO



GYRE.50, no IO



GYRE.50, IO



GYRE.150, no IO



GYRE.150, IO





512 cores 1024 cores 512 to 1024
IBM BGP Babel GYRE.25, no IO 445.00 401.00 0.555 VN
GYRE.25, IO 738.00 561.00 0.658 VN
GYRE.50, no IO 1625.00 1035.00 0.785 DUAL (128 cores), VN (256 to 2048 cores)
GYRE.50, IO 2290.00 1571.00 0.729 DUAL (128 cores), VN (256 to 2048 cores)
GYRE.150, no IO 4348.00 2319.00 0.937 SMP (512 cores), DUAL (1024 to 2048 cores), VN(4096 cores)
GYRE.150, IO 4592.00 2577.00 0.891 SMP (512 cores), DUAL (1024 to 2048 cores), VN(4096 cores)


512 cores 1024 cores 512 to 1024
IBM PowerPC MareNostrum GYRE.25, no IO 297.00 266.00 0.558
GYRE.25, IO 390.00 406.00 0.480
GYRE.50, no IO 981.00 752.00 0.652
GYRE.50, IO 1212.00 1142.00 0.531
GYRE.150, no IO 4095.00


GYRE.150, IO 4516.00 3131.00 0.721


512 cores 1024 cores 512 to 1024
IBM BGP JUGENE GYRE.25, no IO 406.00 373.00 0.544 VN
GYRE.25, IO 568.00 418.00 0.679 VN
GYRE.50, no IO 1388.00 888.00 0.782 DUAL (128 cores), VN (256 to 2048 cores)
GYRE.50, IO 1752.00 1104.00 0.793 DUAL (128 cores), VN (256 to 2048 cores)
GYRE.150, no IO 3626.00 1874.00 0.967 SMP (512 cores), DUAL (1024 to 2048 cores), VN(4096 cores)
GYRE.150, IO 4317.00 2537.00 0.851 SMP (512 cores), DUAL (1024 to 2048 cores), VN(4096 cores)


512 cores 1024 cores 512 to 1024
IBM SP6 CINECA
ST mode
GYRE.25, no IO 89.00 74.00 0.601
GYRE.25, IO 99.00 87.00 0.569
GYRE.50, no IO 209.00 164.00 0.637
GYRE.50, IO 264.00 239.00 0.552
GYRE.150, no IO
276.00

GYRE.150, IO
359.00



512 cores 1024 cores 512 to 1024
IBM SP6 CINECA
SMT mode
GYRE.25, no IO 191.00 421.00 0.227
GYRE.25, IO 276.00 489.00 0.282
GYRE.50, no IO 444.00 545.00 0.407
GYRE.50, IO 648.00 748.00 0.433
GYRE.150, no IO



GYRE.150, IO





512 cores 1024 cores 512 to 1024
Intel Nehalem JuRoPA GYRE.25, no IO 122.00 289.00 0.211
GYRE.25, IO 251.00 310.00 0.405
GYRE.50, no IO 352.00 352.00 0.500
GYRE.50, IO 474.00 521.00 0.455
GYRE.150, no IO 901.00 571.00 0.789
GYRE.150, IO 1061.00 653.00 0.812


16 cores 64 cores 16 to 64
NEC SX9 HLRS GYRE.25, no IO



GYRE.25, IO



GYRE.50, no IO



GYRE.50, IO



GYRE.150, no IO



GYRE.150, IO





512 cores 1024 cores 512 to 1024
Dell PowerEdge Ekman GYRE.25, no IO 143.00 123.00 0.581
GYRE.25, IO 861.00 1495.00 0.288
GYRE.50, no IO 515.00 286.00 0.900
GYRE.50, IO 1697.00 3160.00 0.269
GYRE.150, no IO 1782.00 1026.00 0.868
GYRE.150, IO 3538.00 2457.00 0.720


516 cores 1032 cores 516 to 1032
Cray XT5 Rosa GYRE.25, no IO 491.00 455.00 0.540
GYRE.25, IO 559.00 527.00 0.530
GYRE.50, no IO 1044.00 779.00 0.670
GYRE.50, IO 1063.00 823.00 0.646
GYRE.150, no IO 2037.00 993.00 1.026
GYRE.150, IO 2119.00 1055.00 1.004


528 cores 1032 cores 528 to 1032
Cray XE6 HECToR GYRE.25, no IO 181.00 152.00 0.609
GYRE.25, IO 242.00 269.00 0.460
GYRE.50, no IO 611.00 383.00 0.816
GYRE.50, IO 906.00 612.00 0.757
GYRE.150, no IO 1872.00 848.00 1.129
GYRE.150, IO 2104.00 1119.00 0.962

 

PEPC Datasets Time (in seconds) Efficiency Comment


64 cores 512 cores 64 to 512
SGI Altix HLRB2
high-density
Sphere 5M



Sphere 15M



Sphere 25M





64 cores 512 cores 64 to 512
SGI Altix HLRB2
high-bandwidth
Sphere 5M



Sphere 15M



Sphere 25M





64 cores 512 cores 64 to 512
IBM P6 vip
ST mode
Sphere 5M 8.29 1.34 0.773
Sphere 15M 27.48 3.95 0.870
Sphere 25M 48.02 6.80 0.883


64 cores 512 cores 64 to 512
IBM P6 vip
SMT mode
Sphere 5M 5.64 1.81 0.390
Sphere 15M 18.09 3.45 0.655
Sphere 25M 31.75 5.34 0.743


64 cores 512 cores 64 to 512
Cray XT4 HECToR Sphere 5M 15.04 2.62 0.718
Sphere 15M 48.73 7.61 0.800
Sphere 25M 84.45 13.28 0.795


16 cores 64 cores 16 to 64
Cray X2 HECToR Sphere 5M



Sphere 15M



Sphere 25M





64 cores 512 cores 64 to 512
IBM BGP Babel Sphere 5M
9.17
VN, TXYZ, MESH (256 to 1024 cores), TORUS (other)
Sphere 15M
28.72
VN, TXYZ, MESH (256 to 1024 cores), TORUS (other)
Sphere 25M
49.23
VN, TXYZ, MESH (256 to 1024 cores), TORUS (other)


64 cores 512 cores 64 to 512
IBM PowerPC MareNostrum Sphere 5M 17.36 3.54 0.613
Sphere 15M 57.94 10.01 0.724
Sphere 25M





64 cores 512 cores 64 to 512
IBM BGP JUGENE Sphere 5M
9.18
VN, TXYZ, MESH (256 to 1024 cores), TORUS (other)
Sphere 15M
28.73
VN, TXYZ, MESH (256 to 1024 cores), TORUS (other)
Sphere 25M
49.25
VN, TXYZ, MESH (256 to 1024 cores), TORUS (other)


64 cores 512 cores 64 to 512
IBM SP6 CINECA
ST mode
Sphere 5M 8.23 1.52 0.677
Sphere 15M 27.58 4.11 0.839
Sphere 25M 47.95 6.90 0.869


64 cores 512 cores 64 to 512
IBM SP6 CINECA
SMT mode
Sphere 5M 5.59 2.03 0.344
Sphere 15M 18.23 3.60 0.633
Sphere 25M 31.94 6.22 0.642


64 cores 512 cores 64 to 512
Intel Nehalem JuRoPA Sphere 5M 9.43 1.89 0.624
Sphere 15M 31.26 5.00 0.782
Sphere 25M
8.29



16 cores 64 cores 16 to 64
NEC SX9 HLRS Sphere 5M



Sphere 15M



Sphere 25M





64 cores 512 cores 64 to 512
Dell PowerEdge Ekman Sphere 5M 14.70 3.17 0.580
Sphere 15M 47.59 8.16 0.729
Sphere 25M 81.98 13.76 0.745


72 cores 516 cores 72 to 516
Cray XT5 Rosa Sphere 5M 12.39 2.58 0.670
Sphere 15M 39.51 8.19 0.673
Sphere 25M 67.37 13.24 0.710


72 cores 528 cores 72 to 528
Cray XE6 HECToR Sphere 5M 13.27 1.98 0.914
Sphere 15M 43.05 7.37 0.797
Sphere 25M 73.83 12.84 0.784

 

QuantumESPRESSO Datasets Time (in seconds) Efficiency Comment


64 cores 256 cores 64 to 256
SGI Altix HLRB2
high-density
AUSURF112



PSIWAT





64 cores 256 cores 64 to 256
SGI Altix HLRB2
high-bandwidth
AUSURF112



PSIWAT





64 cores 256 cores 64 to 256
IBM P6 vip
ST mode
AUSURF112



PSIWAT





64 cores 256 cores 64 to 256
IBM P6 vip
SMT mode
AUSURF112



PSIWAT





64 cores 256 cores 64 to 256
Cray XT4 HECToR AUSURF112



PSIWAT





16 cores 64 cores 16 to 64
Cray X2 HECToR AUSURF112



PSIWAT





64 cores 256 cores 64 to 256
IBM BGP Babel AUSURF112



PSIWAT





64 cores 256 cores 64 to 256
IBM PowerPC MareNostrum AUSURF112



PSIWAT





64 cores 256 cores 64 to 256
IBM BGP JUGENE AUSURF112



PSIWAT





64 cores 256 cores 64 to 256
IBM SP6 CINECA
ST mode
AUSURF112



PSIWAT





64 cores 256 cores 64 to 256
IBM SP6 CINECA
SMT mode
AUSURF112



PSIWAT





64 cores 256 cores 64 to 256
Intel Nehalem JuRoPA AUSURF112 240.2 111.73 0.537 PSP_ONDEMAND=0
PSIWAT 1015.42 331.32 0.766 PSP_ONDEMAND=0


16 cores 64 cores 16 to 64
NEC SX9 HLRS AUSURF112 422.20 248.49 0.425
PSIWAT 958.50 418.26 0.573


64 cores 256 cores 64 to 256
Dell PowerEdge Ekman AUSURF112 480.87 177.24 0.678 mpi_paffinity_alone=1
PSIWAT
638.25
mpi_paffinity_alone=1


72 cores 264 cores 72 to 264
Cray XT5 Rosa AUSURF112 440.07 198.72 0.604
PSIWAT
550.99



72 cores 264 cores 72 to 264
Cray XE6 HECToR AUSURF112 417.60 126.74 0.899
PSIWAT
475.91

 

RAMSES Datasets Time (in seconds) Efficiency Comment


512 cores 2048 cores 512 to 2048
SGI Altix HLRB2
high-density
Sedov3D 62.00


AMR 550.00 687.00 0.200
Sedov3D 1024 440.00 115.00 0.957


512 cores 2048 cores 512 to 2048
SGI Altix HLRB2
high-bandwidth
Sedov3D 41.00


AMR 495.00 666.00 0.186
Sedov3D 1024 271.00 79.00 0.858


512 cores 2048 cores 512 to 2048
IBM P6 vip
ST mode
Sedov3D 31.02 7.29 1.064
AMR 422.41 939.15 0.112
Sedov3D 1024 228.44 58.15 0.982


512 cores 2048 cores 512 to 2048
IBM P6 vip
SMT mode
Sedov3D 24.45 7.26 0.842
AMR 586.36


Sedov3D 1024 193.87 63.06 0.769


512 cores 2048 cores 512 to 2048
Cray XT4 HECToR Sedov3D 58.57 14.87 0.985
AMR 739.36 885.99 0.209
Sedov3D 1024 461.88 117.14 0.986


16 cores 64 cores 16 to 64
Cray X2 HECToR Sedov3D



AMR



Sedov3D 1024





512 cores 2048 cores 512 to 2048
IBM BGP Babel Sedov3D 85.58 21.78 0.982 VN, TXYZ, MESH(64 to 1024 cores), TORUS(other)
AMR 1381.51 5069.09 0.068 SMP(64 cores), DUAL(128 to 256 cores), VN(other), TXYZ, MESH(64 to 1024 cores), TORUS(other)
Sedov3D 1024 667.95 169.08 0.988 DUAL(256 cores),VN(512 to 4096 cores), TXYZ, TORUS


512 cores 2048 cores 512 to 2048
IBM PowerPC MareNostrum Sedov3D 87.45


AMR 1432.90


Sedov3D 1024 677.61




512 cores 2048 cores 512 to 2048
IBM BGP JUGENE Sedov3D 85.98 21.81 0.986 VN
AMR 1376.38 5063.67 0.068 SMP(64 cores), DUAL(128 to 256 cores), VN(other)
Sedov3D 1024 672.57 170.03 0.989 DUAL(256 cores), VN(other)


512 cores 2048 cores 512 to 2048
IBM SP6 CINECA
ST mode
Sedov3D 25.87 6.74 0.960
AMR 431.75 901.33 0.120
Sedov3D 1024 201.53 53.47 0.942


512 cores 2048 cores 512 to 2048
IBM SP6 CINECA
SMT mode
Sedov3D 24.01 6.85 0.876
AMR 612.57 3157.17 0.049
Sedov3D 1024 182.32 47.21 0.965


512 cores 2048 cores 512 to 2048
Intel Nehalem JuRoPA Sedov3D 23.11 5.83 0.991
AMR 387.41


Sedov3D 1024 180.51 45.80 0.985


16 cores 64 cores 16 to 64
NEC SX9 HLRS Sedov3D 308.73 82.37 0.937
AMR 10949.07 3235.09 0.846
Sedov3D 1024 2510.97 631.15 0.995


512 cores 2048 cores 512 to 2048
Dell PowerEdge Ekman Sedov3D 50.42 21.67 0.582
AMR 922.11 1865.17 0.124
Sedov3D 1024 683.90 173.97 0.983


516 cores 2052 cores 516 to 2052
Cray XT5 Rosa Sedov3D 57.93 14.83 0.982
AMR 937.14 1022.70 0.230
Sedov3D 1024 456.54 115.01 0.998


528 cores 2064 cores 528 to 2064
Cray XE6 HECToR Sedov3D 50.03 12.38 1.034
AMR 586.15 404.63 0.371
Sedov3D 1024 393.79 99.80 1.009
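
Some rows in these tables report an efficiency above 1, and others (notably the RAMSES AMR dataset) report a longer run time on the larger core count. The sketch below labels such rows, assuming efficiency is defined as (T_small * N_small) / (T_large * N_large), which matches the published values but is not stated on this page. The function name `classify_scaling` and the three labels are hypothetical.

```python
def classify_scaling(t_small, t_large, n_small, n_large):
    """Label a two-point strong-scaling measurement.

    Assumes efficiency = (t_small * n_small) / (t_large * n_large).
    """
    efficiency = (t_small * n_small) / (t_large * n_large)
    if t_large > t_small:
        label = "slowdown"      # the run on more cores took longer
    elif efficiency > 1.0:
        label = "superlinear"   # better than ideal speedup
    else:
        label = "sublinear"     # at most ideal speedup
    return label, round(efficiency, 3)


# RAMSES on IBM P6 vip, ST mode, 512 to 2048 cores (times in seconds)
rows = {
    "Sedov3D": (31.02, 7.29),
    "AMR": (422.41, 939.15),
    "Sedov3D 1024": (228.44, 58.15),
}
for name, (t_512, t_2048) in rows.items():
    print(name, classify_scaling(t_512, t_2048, 512, 2048))
```

Under this definition, any efficiency below N_small / N_large (0.25 for a 512 to 2048 comparison) means the larger run was slower in absolute terms.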

 

SU3_AHiggs Datasets Time (in seconds) Efficiency Comment


512 cores 2048 cores 512 to 2048
SGI Altix HLRB2
high-density
323 lattice,
10 000 iterations
36.00


2563 lattice,
100 iterations
184.50




512 cores 2048 cores 512 to 2048
SGI Altix HLRB2
high-bandwidth
323 lattice,
10 000 iterations
36.30 22.50 0.403
2563 lattice,
100 iterations
125.50 26.70 1.175


512 cores 2048 cores 512 to 2048
IBM P6 vip
ST mode
2563 lattice,
100 iterations
25.70


2563 lattice,
100 iterations
77.90 19.40 1.004


512 cores 2048 cores 512 to 2048
IBM P6 vip
SMT mode
2563 lattice,
100 iterations
25.70


2563 lattice,
100 iterations
44.40 19.00 0.584


512 cores 2048 cores 512 to 2048
Cray XT4 HECToR 323 lattice,
10000 iterations
54.30


2563 lattice,
100 iterations
78.40 19.20 1.021


16 cores 64 cores 16 to 64
Cray X2 HECToR 323 lattice,
10000 iterations




2563 lattice,
100 iterations






512 cores 2048 cores 512 to 2048
IBM BGP Babel 323 lattice,
10000 iterations
154.90

VN, PREFER_TORUS, TXYZ
2563 lattice,
100 iterations
424.60 107.80 0.985 VN, PREFER_TORUS, TXYZ


512 cores 2048 cores 512 to 2048
IBM PowerPC MareNostrum 323 lattice,
10000 iterations
56.00


2563 lattice,
100 iterations
117.90




512 cores 2048 cores 512 to 2048
IBM BGP JUGENE 323 lattice,
10000 iterations
155.90

VN, PREFER_TORUS, TXYZ
2563 lattice,
100 iterations
416.20 104.90 0.992 VN, PREFER_TORUS, TXYZ


512 cores 2048 cores 512 to 2048
IBM SP6 CINECA
ST mode
323 lattice,
10000 iterations
29.40


2563 lattice,
100 iterations
84.60 21.50 0.984


512 cores 2048 cores 512 to 2048
IBM SP6 CINECA SMT mode 32^3 lattice, 10000 iterations 24.70
256^3 lattice, 100 iterations 43.10 11.40 0.945


512 cores 2048 cores 512 to 2048
Intel Nehalem JuRoPA 32^3 lattice, 10000 iterations 13.90 PSP_ONDEMAND=0
256^3 lattice, 100 iterations 51.10 12.50 1.022 PSP_ONDEMAND=0


16 cores 64 cores 16 to 64
NEC SX9 HLRS 32^3 lattice, 10000 iterations
256^3 lattice, 100 iterations






512 cores 2048 cores 512 to 2048
Dell PowerEdge Ekman 32^3 lattice, 10000 iterations 23.50 mpi_paffinity_alone=1
256^3 lattice, 100 iterations 92.30 22.40 1.030 mpi_paffinity_alone=1


516 cores 2052 cores 516 to 2052
Cray XT5 Rosa 32^3 lattice, 10000 iterations 132.40
256^3 lattice, 100 iterations 84.10 19.80 1.068


528 cores 2064 cores 528 to 2064
Cray XE6 HECToR 32^3 lattice, 10000 iterations 30.30
256^3 lattice, 100 iterations 77.90 18.80 1.060
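
The Efficiency values in the results tables appear consistent with the usual strong-scaling definition: the measured speedup between the two core counts divided by the ideal speedup (the ratio of the core counts). The page itself does not state the formula, so the sketch below is an assumption checked only against the published numbers; the function name `parallel_efficiency` is illustrative and not part of the benchmark suite.

```python
def parallel_efficiency(t_small, t_large, cores_small, cores_large):
    """Strong-scaling efficiency: measured speedup divided by ideal speedup.

    t_small / t_large are wall-clock times (seconds) on the smaller and
    larger core counts. The formula is an assumption that matches the
    published table values, not a definition quoted from the source page.
    """
    speedup = t_small / t_large          # how much faster the larger run was
    ideal = cores_large / cores_small    # perfect scaling would give this speedup
    return speedup / ideal


# SGI Altix HLRB2 high-bandwidth, 256^3 lattice, 512 to 2048 cores
print(round(parallel_efficiency(125.50, 26.70, 512, 2048), 3))  # 1.175

# Cray XT5 Rosa, 256^3 lattice, 516 to 2052 cores
print(round(parallel_efficiency(84.10, 19.80, 516, 2052), 3))   # 1.068
```

Values above 1 indicate super-linear scaling between the two core counts, which typically happens when the per-core working set starts to fit in cache on the larger run.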

 

Compiler info and libraries

 

BQCD Compiler Compiler options Libraries
SGI Altix HLRB2 mpif90, ifort, mpicc, icc, intel: 9.1 -O2 -openmp
IBM P6 vip mpxlf90_r, mpcc_r -qsmp -qnosave -qsuffix=f=f90 -q64 (for mpcc_r: -qsmp -qnosave) (add -qsmp=omp for hybrid BQCD)
Cray XT4 HECToR pgf90, pgcc (pgi) and ftn (pathscale) -O3 -OPT:Ofast (pathscale) or -fastsse (pgi); for hybrid add -mp (pathscale) or -mp=nonuma (pgi)
Cray X2 HECToR


IBM BGP Babel mpixlf90_r, mpixlc_r -qarch=450 -qtune=450 -O3 -qstrict
IBM PowerPC MareNostrum mpif90, mpicc -qsuffix=f=f90 -q64 (add -qsmp=omp for hybrid)
IBM BGP JUGENE mpixlf90_r, mpixlc_r -O3 -qstrict -qtune=450 -qarch=450 (add -qsmp=omp -qthreaded for hybrid)
IBM SP6 CINECA mpxlf90_r, mpcc_r -qsuffix=f=f90 -q64 (add -qsmp=omp for hybrid)
Intel Nehalem JuRoPA mpif90, mpicc, icc, ifort -O3 -no-multibyte-chars (add -openmp for hybrid)
NEC SX9 HLRS


Dell PowerEdge Ekman i-compilers/11.1, mpicc, mpif90 -fno-range-check -O3 -march=native -mtune=native
Cray XT5 Rosa PrgEnv-gnu, ftn -s -Wall -Wstrict-prototypes -fno-range-check -O3 -march=native -mtune=native
Cray XE6 HECToR pgf90, pgcc (pgi) and ftn -O3 -OPT:Ofast (pathscale) or -fastsse (pgi); for hybrid add -mp (pathscale) or -mp=nonuma (pgi)

 

CPMD Compiler Compiler options Libraries
SGI Altix HLRB2


IBM P6 vip


Cray XT4 HECToR


Cray X2 HECToR


IBM BGP Babel bgxlf_r -O3 -w -qsmp=omp -qnosave -qarch=450 -c lapack, essl
IBM PowerPC MareNostrum


IBM BGP JUGENE


IBM SP6 CINECA


Intel Nehalem JuRoPA


NEC SX9 HLRS


Dell PowerEdge Ekman


Cray XT5 Rosa


Cray XE6 HECToR


 

Fenfloss Compiler Compiler options Libraries
SGI Altix HLRB2 ifort 9.1 -O3 -ipo
IBM P6 vip mpxlf90_r V12.1 -q64 -O5
Cray XT4 HECToR pgf90/7.1.4 -fastsse -Msmart -Mipa=fast
Cray X2 HECToR ftn (x1x2-pe/6.0.0.1) -O3
IBM BGP Babel mpixlf90_r -O5
IBM PowerPC MareNostrum mpif90 V12.1 -q64 -O5 (additional -qipa=level=1 for linking)
IBM BGP JUGENE mpixlf90_r V11.1 -O5 -qtune=450 -qarch=450 -qessl -lessl -qmaxmem=3145728 -qxflag=diagnostic (additional -qipa=level=1 for linking)
IBM SP6 CINECA mpxlf90_r V12.1 -O5
Intel Nehalem JuRoPA mpif90 V 11.1.072 -O3
NEC SX9 HLRS sxmpif90 Rev.393 -C hopt -sx9 -pi exp=gba expin=gba.f90 -Wf'-pvctl loopcnt=1000000' -Wf,-pvctl on_adb
Dell PowerEdge Ekman


Cray XT5 Rosa


Cray XE6 HECToR


 

GADGET Compiler Compiler options Libraries
SGI Altix HLRB2 icc 9.1 -O2 fftw2, gsl
IBM P6 vip mpcc_r -q64 -qtune=pwr6 -qarch=pwr6 -O3 -qstrict -qcpluscmt -qipa fftw2, gsl
Cray XT4 HECToR pgcc -fastsse -O3 -Mipa=fast,inline fftw2, gsl, hdf5
Cray X2 HECToR


IBM BGP Babel mpixlc_r -O3 -qstrict -qarch=450 -qtune=450 -qcpluscmt fftw2, gsl, hdf5
IBM PowerPC MareNostrum


IBM BGP JUGENE


IBM SP6 CINECA


Intel Nehalem JuRoPA


NEC SX9 HLRS


Dell PowerEdge Ekman


Cray XT5 Rosa


Cray XE6 HECToR


 

GENE Compiler Compiler options Libraries
SGI Altix HLRB2 ifort 9.1 -O3 -ip -ftz -align -fno-alias -r8 blas, fftw3
IBM P6 vip mpxlf90_r -q64 -qrealsize=8 -qtune=pwr6 -qarch=pwr6 -O3 -qsuffix=cpp=F90 -WF,-DWITHESSL,-DDOUBLE_PREC essl
Cray XT4 HECToR ftn (pgi) -r8 -fast -Mipa=fast -Minline -Minfo -DDOUBLE_PREC
Cray X2 HECToR


IBM BGP Babel bgxlf90_r -qtune=450 -qarch=450d esslbg
IBM PowerPC MareNostrum


IBM BGP JUGENE


IBM SP6 CINECA


Intel Nehalem JuRoPA


NEC SX9 HLRS


Dell PowerEdge Ekman


Cray XT5 Rosa


Cray XE6 HECToR


 

IFS Compiler Compiler options Libraries
SGI Altix HLRB2





IBM P6 vip mpxlf90_r -qextname -q64 -qarch=pwr6 -O3 -qstrict -qautodbl=dbl4 -qfree=F90 -qsuffix=cpp=F90 -qsmp=omp -qsource -NS32648 -WF,-DRS6K -WF,-DBLAS -WF,POWER6 essl, mass, massvp6
mpcc_r -O3 -qarch=pwr6 -q64 -qmaxmem=-1 -DRS6K -DINTERCEPT_ALLOC -D_ABI64 -DFORTRAN_WITH_UNDERSCORE
Cray XT4 HECToR ftn (pathscale) -O3 -mp -LIST:=ON -fullwarn -byteswapio -r8 -DBLAS -DLITTLE_ENDIAN -DLINUX acml, dl
cc -O1 -DBLAS -DLITTLE_ENDIAN -DLINUX
Cray X2 HECToR





IBM BGP Babel mpixlf90_r -O2 -qstrict -qarch=450 -qmaxmem=-1 -qextname -qsource -qautodbl=dbl4 -qfree=F90 -qsmp=omp -qsuffix=cpp=F90 -WF,-DLINUX -WF,-DBLAS -WF,-DBLUEGENE esslbg, mass, massv, fmpich_.cnk
mpixlc_r -O2 -qarch=450 -qmaxmem=-1 -DLINUX -DBLUEGENE -DINTERCEPT_ALLOC -DFORTRAN_WITH_UNDERSCORE
IBM PowerPC MareNostrum mpif90 -O3 -qstrict -q64 -qarch=ppc970 -qtune=ppc970 -qextname -qsource -qautodbl=dbl4 -qfree=F90 -qsuffix=cpp=F90 -WF,-DLINUX -WF,-DBLAS essl, mass_64, massvp6_64
mpicc -O3 -q64 -qarch=ppc970 -qtune=ppc970 -DLINUX -DINTERCEPT_ALLOC -D_ABI64 -DFORTRAN_WITH_UNDERSCORE
IBM BGP JUGENE mpixlf90_r -O2 -qstrict -qarch=450 -qmaxmem=-1 -qextname -qsource -qautodbl=dbl4 -qfree=F90 -qsmp=omp -qsuffix=cpp=F90 -WF,-DLINUX -WF,-DBLAS -WF,-DBLUEGENE esslbg, mass, massv, fmpich_.cnk
mpixlc_r -O2 -qarch=450 -qmaxmem=-1 -DLINUX -DBLUEGENE -DINTERCEPT_ALLOC -DFORTRAN_WITH_UNDERSCORE
IBM SP6 CINECA mpxlf90_r -qextname -q64 -qarch=pwr6 -O3 -qstrict -qautodbl=dbl4 -qfree=F90 -qsuffix=cpp=F90 -qsmp=omp -qsource -NS32648 -WF,-DRS6K -WF,-DBLAS -WF,POWER6 essl, mass, massvp6
mpcc_r -O3 -qarch=pwr6 -q64 -qmaxmem=-1 -DRS6K -DINTERCEPT_ALLOC -D_ABI64 -DFORTRAN_WITH_UNDERSCORE
Intel Nehalem JuRoPA mpif90(intel) -O3 -c -openmp -fp-model precise -convert big_endian -xhost -r8 -DBLAS -DLITTLE_ENDIAN -DLINUX mkl_intel_lp64, mkl_intel_thread,mkl_core,iomp5,pthread
mpicc(intel) -O2 -DBLAS -DLITTLE_ENDIAN -DLINUX
NEC SX9 HLRS





Dell PowerEdge Ekman mpif90(intel) -O3 -c -openmp -fp-model precise -convert big_endian -xhost -r8 -DBLAS -DLITTLE_ENDIAN -DLINUX mkl_intel_lp64, mkl_intel_thread,mkl_core,iomp5,pthread
mpicc(intel) -O2 -DBLAS -DLITTLE_ENDIAN -DLINUX
Cray XT5 Rosa ftn (pathscale) -O3 -mp -LIST:=ON -fullwarn -byteswapio -r8 -DBLAS -DLITTLE_ENDIAN -DLINUX acml, dl
cc -O1 -DBLAS -DLITTLE_ENDIAN -DLINUX
Cray XE6 HECToR ftn (pathscale) -O3 -mp -LIST:=ON -fullwarn -byteswapio -r8 -DBLAS -DLITTLE_ENDIAN -DLINUX acml, dl
cc

 

IQCS Compiler Compiler options Libraries
SGI Altix HLRB2


IBM P6 vip mpxlf90_r -O5 -q64 -qtune=pwr6 -qarch=pwr6 -bdatapsize:64k -btextpsize:64k -bstackpsize:64k
Cray XT4 HECToR pgf90/8.0.6 -fastsse -Mipa=fast -O3
Cray X2 HECToR ftn -f free -N 255 -h cpu=cray-x2 -O3 -e Z -DDYNALLOC
IBM BGP Babel mpixlf90_r -O5 -qnostrict -qarch=450 -qtune=450 -WF,-DDYNALLOC
IBM PowerPC MareNostrum mpif90 -O5
IBM BGP JUGENE mpixlf90_r -O5 -qnostrict -qarch=450 -qtune=450 -WF,-DDYNALLOC
IBM SP6 CINECA mpxlf90_r -O5 -q64 -qtune=pwr6 -qarch=pwr6 -bdatapsize:64k -btextpsize:64k -bstackpsize:64k
Intel Nehalem JuRoPA mpif90 -O3 -DDYNALLOC
NEC SX9 HLRS sxf90 -C hopt -sx9 -Wf'-pvctl loopcnt=1000000' -DDYNALLOC
Dell PowerEdge Ekman mpif90 (Intel 11.1.069) -O3 -ip -xHOST -no_pred -DDYNALLOC
Cray XT5 Rosa crayftn -O3 -f free -N 255 -DDYNALLOC -lhugetlbfs
Cray XE6 HECToR pgf90/10.9-0 -fastsse -Mipa=fast -O3

 

NAMD Compiler Compiler options Libraries
SGI Altix HLRB2 mpicc + icc for namd: CXX = icpc -D_IA64 -I/usr/local/gnu/include CXXOPTS = -static-libcxa -O2 fftw

for charm++: ./build charm++ mpi-linux-ia64 -nobs -DCMK_OPTIMIZE=1
IBM P6 vip xlc_r for namd: -O4 -qinlglue -qarch=pwr6 -qtune=pwr6 fftw
mpCC_r for charm++: ./build charm++ mpi-sp -DCMK_OPTIMIZE=1
Cray XT4 HECToR pgCC for namd: CXXOPTS=-fastsse -Mipa=fast,inline COPTS=-fastsse -Mipa=fast,inline CHARMOPTS=-lgmalloc fftw
pgi/7.2-3 for charm++: ./build charm++ mpi-crayxt3 -DCMK_OPTIMIZE=1
Cray X2 HECToR





IBM BGP Babel bgxlC_r, bgxlc_r for namd: -O3 -qhot -qarch=450d -qtune=450 fftw

for charm++: ./build charm++ mpi-bluegenep xlc --no-shared -j4 -O3 -qarch=450d -DCMK_OPTIMIZE=1
IBM PowerPC MareNostrum mpicc , mpiCC for namd: -O0 -q64 -Q -DARCH_POWERPC - -qarch=ppc970 -qtune=ppc970 -qcache=auto fftw

for charm++: ./build charm++ mpi-sp xlc64 -DCMK_OPTIMIZE=1 -DCMK_64
IBM BGP JUGENE bgxlC_r, bgxlc_r for namd: -O2 -qarch=450d -qtune=450 -DFFTW_ENABLE_FLOAT -qstaticinline -DNO_SOCKET -DDUMMY_VMDSOCK -DNOHOSTNAME -DNO_CHDIR -DNO_STRSTREAM_H -DNO_GETPWUID fftw

for charm++: ./build charm++ mpi-bluegenep xlc --no-shared -j4 -O3 -qarch=450d -DCMK_OPTIMIZE=1
IBM SP6 CINECA xlC_r for namd: -O4 -qinlglue -qarch=pwr6 -qtune=pwr6 -bmaxdata:0x80000000
mpCC_r for charm++: ./build charm++ mpi-sp -DCMK_OPTIMIZE=1 -DCMK_64 -memory=os
Intel Nehalem JuRoPA





NEC SX9 HLRS





Dell PowerEdge Ekman





Cray XT5 Rosa





Cray XE6 HECToR





 

NEMO Compiler Compiler options Libraries
SGI Altix HLRB2 ifort 9.1 -O2 -r8 netcdf
IBM P6 vip mpxlf90_r -q64 -qtune=pwr6 -qarch=pwr6 -O3 -qrealsize=8 -qsave -qsuffix=cpp=F90 netcdf
Cray XT4 HECToR ftn -fastsse -r8 -O3 -Mipa -Minline -Mpreprocess
Cray X2 HECToR


IBM BGP Babel mpixlf90_r -O3 -qstrict -qtune=450 -qarch=450 netcdf
IBM PowerPC MareNostrum mpif90 -O3 -qstrict -qrealsize=8 -qsuffix=f=f90 -qsuffix=cpp=F90 -qtune=ppc970 -qarch=ppc970 -q64
IBM BGP JUGENE mpixlf90_r -O3 -qstrict -qtune=450 -qarch=450 netcdf
IBM SP6 CINECA mpxlf90_r -q64 -qtune=pwr6 -qarch=pwr6 -O3 -qrealsize=8 -qsave -qsuffix=cpp=F90 netcdf
Intel Nehalem JuRoPA ifort -O3 -fp-model precise -fp-model except -ipo -real-size 64 -axSSE4.2 netcdf
NEC SX9 HLRS sxf90 -C hopt -dW -Wf,-A idbl4 -Ep -Wf,-P nh -Wf,-pvctl loopcnt=1000000 netcdf
Dell PowerEdge Ekman ifort -O3 -fpp -fp-model precise -fp-model except -real-size 64 -axSSE4.2 netcdf
Cray XT5 Rosa ifort -O3 -fpp -fp-model precise -fp-model except -real-size 64 -axSSE4.2 netcdf
Cray XE6 HECToR


 

PEPC Compiler Compiler options Libraries
SGI Altix HLRB2


IBM P6 vip mpxlf90_r -qtune=pwr6 -qarch=pwr6 -q64 -O4 -qipa=level=1 -qipa=inline=key2addr
Cray XT4 HECToR ftn (pgi 8.0.6) -fastsse -Mipa=fast -O3 -tp barcelona-64
Cray X2 HECToR


IBM BGP Babel mpixlf90_r -qarch=450d -qtune=450 -O4 -qnostrict -qipa=level=1 -qipa=inline=key2addr
IBM PowerPC MareNostrum mpif90 -q64 -O4 -qipa=level=1 -qipa=inline=key2addr
IBM BGP JUGENE mpixlf90_r -qarch=450d -qtune=450 -O4 -qnostrict -qipa=level=1 -qipa=inline=key2addr
IBM SP6 CINECA mpxlf90_r -qtune=pwr6 -qarch=pwr6 -q64 -O4 -qipa=level=1 -qipa=inline=key2addr -bdatapsize:64k -btextpsize:64k -bstackpsize:64k
Intel Nehalem JuRoPA mpif90 -O2 -ip -ipo -axSSE4.2
NEC SX9 HLRS


Dell PowerEdge Ekman mpif90 (pgi 10.3-0) -fastsse -O3 -Mipa=fast -Mvect -Minline=name:key2addr
Cray XT5 Rosa ftn (pgi 10.8-0) -fastsse -O3 -Mipa=fast -Mvect -lhugetlbfs -Minline=name:key2addr
Cray XE6 HECToR pgf90/10.9-0 -fastsse -O3 -Mipa=fast -Mipa=inline -Mvect -Minline=name:key2addr

 

QuantumESPRESSO Compiler Compiler options Libraries
SGI Altix HLRB2


IBM P6 vip


Cray XT4 HECToR


Cray X2 HECToR


IBM BGP Babel


IBM PowerPC MareNostrum


IBM BGP JUGENE


IBM SP6 CINECA


Intel Nehalem JuRoPA intel/11.1.059 -O2 -axSSE4.2 -assume byterecl mkl/10.2.2.025
NEC SX9 HLRS sx/Rev.410 -C hopt -sx9 MathKeisan/3.0
Dell PowerEdge Ekman intel/11.1 -O3 -assume byterecl mkl/11.1 openmpi/1.4.1-intel
Cray XT5 Rosa pgi/10.6.0 -fast -tp istanbul-64 xt-libsci/10.4.6, fftw/3.2.2.1
Cray XE6 HECToR pgi/10.9.0 -fast -tp istanbul-64 xt-libsci/10.5.0, fftw/3.2.2.1

 

RAMSES Compiler Compiler options Libraries
SGI Altix HLRB2 ifort 9.1 -O3 -ipo
IBM P6 vip mpxlf90_r -O4 -qnostrict -qhot -qarch=pwr6 -qtune=pwr6
Cray XT4 HECToR ftn -O3 -Mipa -Minline -Mbyteswapio -Mpreprocess
Cray X2 HECToR


IBM BGP Babel mpixlf90_r -O4 -qstrict -qhot -qtune=450 -qarch=450d
IBM PowerPC MareNostrum mpif90 -qfree=f90 -qsuffix=f=f90 -qsuffix=cpp=f90 -O4 -qnostrict -qhot -q64 -qtune=ppc970 -qarch=ppc970
IBM BGP JUGENE mpixlf90_r -O4 -qstrict -qhot -qtune=450 -qarch=450d
IBM SP6 CINECA mpxlf90_r -O4 -qnostrict -qhot -qarch=pwr6 -qtune=pwr6
Intel Nehalem JuRoPA ifort

NEC SX9 HLRS sxf90

Dell PowerEdge Ekman ifort -O3 -fpp -ipo -axSSE4.2 -convert big_endian
Cray XT5 Rosa ifort -O3 -fpp -ipo -axSSE4.2 -convert big_endian
Cray XE6 HECToR


 

SU3_AHiggs Compiler Compiler options Libraries
SGI Altix HLRB2 intel/9.1 -O3 -ip
IBM P6 vip mpcc_r -q64 -O5
Cray XT4 HECToR pathscale/3.1 -Ofast -march=barcelona
Cray X2 HECToR


IBM BGP Babel mpixlc_r -O3 -qarch=450d -qtune=450 -qipa=level=2 mass
IBM PowerPC MareNostrum ibm/10.1 -q64 -O4 -qnohot
IBM BGP JUGENE ibm/9.0 -O5 -qarch=450d -qtune=450 mass/4.4
IBM SP6 CINECA ibm/10.1 -q64 -O5
Intel Nehalem JuRoPA gnu/4.3.2 -O3 -march=native -mtune=native
NEC SX9 HLRS


Dell PowerEdge Ekman gnu/4.3.2 -O3 -march=native -mtune=native openmpi/1.4.1-gcc
Cray XT5 Rosa gnu/4.4.4 -O3 -march=native -mtune=native
Cray XE6 HECToR pathscale/3.2.99 -Ofast -march=barcelona

 
