Debian Linux GCC 8 Benchmark -mindirect-branch=thunk

GCC 8 benchmarking of user-space with -mindirect-branch=thunk and -mindirect-branch=thunk-inline for retpolines. Tests by Michael Larabel for a future article on Phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1801316-AL-1801161PT71.

Stock, -mindirect-branch=thunk, -mindirect-branch=thunk-inline:
- Processor: Intel Core i9-7980XE @ 4.40GHz (18 Cores / 36 Threads)
- Motherboard: ASUS PRIME X299-A (1004 BIOS)
- Chipset: Intel Device 2020
- Memory: 16384MB
- Disk: 120GB Force MP500
- Graphics: LLVMpipe
- Audio: Realtek ALC1220
- Monitor: Acer B286HK
- Network: Intel Connection
- OS: Debian 9.3
- Kernel: 4.15.0-rc8-retpo-underflow (x86_64) 20180115
- Desktop: GNOME Shell 3.22.3
- Display Driver: modesetting 1.19.2
- OpenGL: 3.3 Mesa 13.0.6 Gallium 0.4 (LLVM 3.9 256 bits)
- Compiler: GCC 8.0.1 20180115
- File-System: ext4
- Screen Resolution: 3840x2160

spectre-tests:
- Processor: AMD Ryzen 7 1800X Eight-Core @ 3.70GHz (8 Cores / 16 Threads)
- Motherboard: ASRock X370 Professional Gaming
- Chipset: AMD Family 17h
- Memory: 64512MB
- Disk: 500GB Samsung SSD 850 + 2000GB Western Digital WD2003FZEX-0
- Graphics: llvmpipe 6144MB (139/405MHz)
- Audio: NVIDIA GP106 HD Audio
- Monitor: HP ZR2440w
- Network: Aquantia Device d108 + Intel Device 24fb
- OS: openSUSE 20180129
- Kernel: 4.14.15-1-default (x86_64)
- Display Server: X Server 1.19.6
- Display Driver: NVIDIA 384.111
- OpenGL: 3.3 Mesa 17.3.3 (LLVM 5.0 128 bits)
- Vulkan: 1.0.65
- Compiler: GCC 7.3.0 + Clang 5.0.1 (SVN 312548) + LLVM 5.0.1 + ICC + CUDA 8.0
- File-System: btrfs
- Screen Resolution: 3840x1200

Environment Details
- Stock: CXXFLAGS=-O3 -march=native, CFLAGS=-O3 -march=native
- -mindirect-branch=thunk: CXXFLAGS=-O3 -march=native -mindirect-branch=thunk, CFLAGS=-O3 -march=native -mindirect-branch=thunk
- -mindirect-branch=thunk-inline: CXXFLAGS=-O3 -march=native -mindirect-branch=thunk-inline, CFLAGS=-O3 -march=native -mindirect-branch=thunk-inline

Compiler Details
- Stock: --disable-multilib --enable-checking=release
- -mindirect-branch=thunk: --disable-multilib --enable-checking=release
- -mindirect-branch=thunk-inline: --disable-multilib --enable-checking=release
- spectre-tests: --build=x86_64-suse-linux --disable-libcc1 --disable-libssp --disable-libstdcxx-pch --disable-libvtv --disable-werror --enable-__cxa_atexit --enable-checking=release --enable-gnu-indirect-function --enable-languages=c,c++,objc,fortran,obj-c++,ada,go --enable-libstdcxx-allocator=new --enable-linux-futex --enable-multilib --enable-offload-targets=hsa,nvptx-none=/usr/nvptx-none, --enable-plugin --enable-ssp --enable-version-specific-runtime-libs --host=x86_64-suse-linux --mandir=/usr/share/man --with-arch-32=x86-64 --with-gcc-major-version-only --with-slibdir=/lib64 --with-tune=generic --without-cuda-driver --without-system-libunwind

Disk Details
- Stock, -mindirect-branch=thunk: NONE / data=ordered,errors=remount-ro,relatime,rw

Processor Details
- Stock: Scaling Governor: intel_pstate powersave
- -mindirect-branch=thunk: Scaling Governor: intel_pstate powersave
- -mindirect-branch=thunk-inline: Scaling Governor: intel_pstate powersave
- spectre-tests: Scaling Governor: acpi-cpufreq ondemand

Python Details
- Stock, -mindirect-branch=thunk: Python 2.7.13 + Python 3.5.3

Security Details
- Stock, -mindirect-branch=thunk, -mindirect-branch=thunk-inline: KPTI, Full retpoline with underflow protection

System Details
- spectre-tests: Anisotropic Filtering: 16x
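The three Debian configurations differ only in the flag appended to a common CFLAGS/CXXFLAGS base. As a minimal sketch, the flag sets from the Environment Details can be generated like this; the `CONFIGS` table and the `build_env` helper are hypothetical names used for illustration, and how the variables were actually exported before each run is not stated in the export.

```python
# Common optimization flags shared by all three Debian configurations.
BASE_FLAGS = "-O3 -march=native"

# Extra flag per configuration, as listed under Environment Details.
# CONFIGS and build_env are illustrative names, not part of the test suite.
CONFIGS = {
    "Stock": "",
    "-mindirect-branch=thunk": "-mindirect-branch=thunk",
    "-mindirect-branch=thunk-inline": "-mindirect-branch=thunk-inline",
}


def build_env(extra_flag):
    """Return the CFLAGS/CXXFLAGS pair for one configuration."""
    flags = f"{BASE_FLAGS} {extra_flag}".strip()
    return {"CFLAGS": flags, "CXXFLAGS": flags}


if __name__ == "__main__":
    for name, extra in CONFIGS.items():
        print(name, build_env(extra))
```

The resulting dictionary could be merged into a copy of `os.environ` and passed to whatever process builds the benchmarks.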

Result overview. Values are listed as Stock / -mindirect-branch=thunk / -mindirect-branch=thunk-inline / spectre-tests; n/a marks a test with no result for that configuration.
- mpcbench: Multi-Precision Benchmark: 10013 / 9643 / 9830 / 7557
- hpcc: G-HPL: 85.93090 / 86.04620 / 85.97547 / n/a
- hpcc: G-Ffte: 5.88475 / 5.58564 / 5.56022 / n/a
- hpcg: 1.38 / 1.35 / 1.38 / 1.01
- tscp: AI Chess Performance: 1386794 / 1185357 / 1116421 / 1068228
- stockfish: Total Time: 2904 / 3074 / 3228 / 3454
- bullet: Raytests: 2.53 / 2.78 / 2.89 / n/a
- bullet: 3000 Fall: 3.88 / 4.11 / 6.53 / 4.25
- bullet: 1000 Stack: 4.44 / 4.57 / 4.83 / 4.85
- bullet: 1000 Convex: 4.37 / 4.65 / 4.93 / 4.51
- bullet: Prim Trimesh: 0.92 / 1.00 / 1.03 / 0.89
- bullet: Convex Trimesh: 1.08 / 1.23 / 1.28 / 1.11
- ffmpeg: H.264 HD To NTSC DV: 13.29 / 13.89 / 14.13 / n/a
- pgbench: Buffer Test - Normal Load - Read Write: 11387.09 / 11104.86 / 11460.43 / 2455.92
- pgbench: Buffer Test - Heavy Contention - Read Write: 11290.86 / 11147.89 / 10540.60 / 3070.48
- redis: GET: 2222262.58 / 2187123.33 / 2160702.13 / 2297157.92
- redis: SET: 1399046.62 / 1280528.47 / 1457026.92 / 1676110.71
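To read the overview as a cost of the retpoline flags, each result can be expressed as a percentage change against Stock. The sketch below does that for three of the tests, flipping the sign for "Fewer Is Better" metrics so that a negative number always means a slowdown. The `RESULTS` table, `percent_change`, and `summarize` are hypothetical names; the numbers are copied from the overview.

```python
# (Stock, thunk, thunk-inline) values copied from the result overview.
# The boolean records whether a higher value is better for that test.
RESULTS = {
    "tscp: AI Chess Performance": ((1386794, 1185357, 1116421), True),
    "stockfish: Total Time": ((2904, 3074, 3228), False),
    "ffmpeg: H.264 HD To NTSC DV": ((13.29, 13.89, 14.13), False),
}


def percent_change(stock, value, higher_is_better):
    """Signed change relative to Stock; negative always means slower."""
    change = (value - stock) / stock * 100.0
    return change if higher_is_better else -change


def summarize(results):
    """Map each test to (thunk %, thunk-inline %) rounded to one decimal."""
    rows = {}
    for test, ((stock, thunk, inline), higher) in results.items():
        rows[test] = (
            round(percent_change(stock, thunk, higher), 1),
            round(percent_change(stock, inline, higher), 1),
        )
    return rows


if __name__ == "__main__":
    for test, (thunk, inline) in summarize(RESULTS).items():
        print(f"{test}: thunk {thunk:+.1f}%, thunk-inline {inline:+.1f}%")
```

With these inputs TSCP comes out at roughly -14.5% for thunk and -19.5% for thunk-inline, while Stockfish and FFmpeg land between about -4.5% and -11.2%.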

GNU MPC

Multi-Precision Benchmark

GNU MPC 1.1.0. Global Score, More Is Better.
- Stock: 10013 (SE +/- 43.72, N = 3), built with -lm -O3 -march=native
- -mindirect-branch=thunk: 9643 (SE +/- 84.52, N = 3), built with -lm -O3 -march=native
- -mindirect-branch=thunk-inline: 9830 (SE +/- 75.72, N = 3), built with -lm -O3 -march=native
- spectre-tests: 7557 (SE +/- 8.82, N = 3), built with -O2 -pedantic -fomit-frame-pointer -m64 -mtune=k8 -march=k8
1. (CC) gcc options: -MT -MD -MP -MF

HPC Challenge

Test / Class: G-HPL

HPC Challenge 1.5.0. GFLOPS, More Is Better.
- Stock: 85.93 (SE +/- 0.29, N = 3)
- -mindirect-branch=thunk: 86.05 (SE +/- 0.10, N = 3)
- -mindirect-branch=thunk-inline: 85.98 (SE +/- 0.17, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. BLAS + Open MPI 2.0.2

HPC Challenge

Test / Class: G-Ffte

HPC Challenge 1.5.0. GFLOPS, More Is Better.
- Stock: 5.88475 (SE +/- 0.18843, N = 3)
- -mindirect-branch=thunk: 5.58564 (SE +/- 0.01968, N = 3)
- -mindirect-branch=thunk-inline: 5.56022 (SE +/- 0.02991, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. BLAS + Open MPI 2.0.2

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.0. GFLOP/s, More Is Better.
- Stock: 1.38 (SE +/- 0.01, N = 3)
- -mindirect-branch=thunk: 1.35 (SE +/- 0.00, N = 3)
- -mindirect-branch=thunk-inline: 1.38 (SE +/- 0.01, N = 3)
- spectre-tests: 1.01 (SE +/- 0.00, N = 3)

TSCP

AI Chess Performance

TSCP 1.81. Nodes Per Second, More Is Better.
- Stock: 1386794 (SE +/- 18404.85, N = 6)
- -mindirect-branch=thunk: 1185357 (SE +/- 10473.03, N = 5)
- -mindirect-branch=thunk-inline: 1116421 (SE +/- 17216.49, N = 5)
- spectre-tests: 1068228 (SE +/- 507.53, N = 5)
1. (CC) gcc options: -O3 -march=native

Stockfish

Total Time

Stockfish 2014-11-26. ms, Fewer Is Better.
- Stock: 2904 (SE +/- 6.11, N = 3), built with -march=native
- -mindirect-branch=thunk: 3074 (SE +/- 44.96, N = 3), built with -march=native
- -mindirect-branch=thunk-inline: 3228 (SE +/- 56.05, N = 3), built with -march=native
- spectre-tests: 3454 (SE +/- 6.06, N = 3)
1. (CXX) g++ options: -lpthread -O3 -fno-exceptions -fno-rtti -ansi -pedantic -msse -msse3 -mpopcnt -flto

Bullet Physics Engine

Test: Raytests

Bullet Physics Engine 2.81. Seconds, Fewer Is Better.
- Stock: 2.53 (SE +/- 0.04, N = 3)
- -mindirect-branch=thunk: 2.78 (SE +/- 0.04, N = 3)
- -mindirect-branch=thunk-inline: 2.89 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: 3000 Fall

Bullet Physics Engine 2.81. Seconds, Fewer Is Better.
- Stock: 3.88 (SE +/- 0.02, N = 3), built with -march=native
- -mindirect-branch=thunk: 4.11 (SE +/- 0.02, N = 3), built with -march=native
- -mindirect-branch=thunk-inline: 6.53 (SE +/- 2.39, N = 3), built with -march=native
- spectre-tests: 4.25 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU
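The thunk-inline result for 3000 Fall carries a standard error of 2.39 seconds, far larger than the other runs, so its gap to Stock deserves a sanity check. A rough way to do that is to express the difference between two means in units of their combined standard error. This is only a heuristic sketch with N = 3 runs per configuration, not a formal significance test, and `difference_in_se_units` is a hypothetical helper name.

```python
import math


def difference_in_se_units(a, se_a, b, se_b):
    """Gap between two means, in units of their combined standard error."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(a - b) / combined


# Bullet "3000 Fall" results in seconds, as (mean, standard error).
stock = (3.88, 0.02)
thunk = (4.11, 0.02)
thunk_inline = (6.53, 2.39)

if __name__ == "__main__":
    print(f"Stock vs thunk: {difference_in_se_units(*stock, *thunk):.1f}")
    print(f"Stock vs thunk-inline: {difference_in_se_units(*stock, *thunk_inline):.1f}")
```

Stock versus thunk is separated by about 8 combined standard errors, while Stock versus thunk-inline is separated by only about 1.1, which suggests the 6.53-second figure is dominated by run-to-run noise.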

Bullet Physics Engine

Test: 1000 Stack

Bullet Physics Engine 2.81. Seconds, Fewer Is Better.
- Stock: 4.44 (SE +/- 0.04, N = 3), built with -march=native
- -mindirect-branch=thunk: 4.57 (SE +/- 0.03, N = 3), built with -march=native
- -mindirect-branch=thunk-inline: 4.83 (SE +/- 0.05, N = 3), built with -march=native
- spectre-tests: 4.85 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: 1000 Convex

Bullet Physics Engine 2.81. Seconds, Fewer Is Better.
- Stock: 4.37 (SE +/- 0.17, N = 3), built with -march=native
- -mindirect-branch=thunk: 4.65 (SE +/- 0.11, N = 3), built with -march=native
- -mindirect-branch=thunk-inline: 4.93 (SE +/- 0.06, N = 3), built with -march=native
- spectre-tests: 4.51 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: Prim Trimesh

Bullet Physics Engine 2.81. Seconds, Fewer Is Better.
- Stock: 0.92 (SE +/- 0.03, N = 3), built with -march=native
- -mindirect-branch=thunk: 1.00 (SE +/- 0.02, N = 3), built with -march=native
- -mindirect-branch=thunk-inline: 1.03 (SE +/- 0.02, N = 3), built with -march=native
- spectre-tests: 0.89 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: Convex Trimesh

Bullet Physics Engine 2.81. Seconds, Fewer Is Better.
- Stock: 1.08 (SE +/- 0.04, N = 3), built with -march=native
- -mindirect-branch=thunk: 1.23 (SE +/- 0.02, N = 3), built with -march=native
- -mindirect-branch=thunk-inline: 1.28 (SE +/- 0.00, N = 3), built with -march=native
- spectre-tests: 1.11 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

FFmpeg

H.264 HD To NTSC DV

FFmpeg 3.3.3. Seconds, Fewer Is Better.
- Stock: 13.29 (SE +/- 0.31, N = 6)
- -mindirect-branch=thunk: 13.89 (SE +/- 0.30, N = 6)
- -mindirect-branch=thunk-inline: 14.13 (SE +/- 0.32, N = 6)
1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lxcb -lxcb-shm -lxcb-xfixes -lxcb-shape -lasound -lm -llzma -lbz2 -pthread -O3 -march=native -std=c11 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT

PostgreSQL pgbench

Scaling: Buffer Test - Test: Normal Load - Mode: Read Write

PostgreSQL pgbench 10.0. TPS, More Is Better.
- Stock: 11387.09 (SE +/- 221.41, N = 6), built with -O3 -march=native
- -mindirect-branch=thunk: 11104.86 (SE +/- 276.43, N = 6), built with -O3 -march=native
- -mindirect-branch=thunk-inline: 11460.43 (SE +/- 63.39, N = 3), built with -O3 -march=native
- spectre-tests: 2455.92 (SE +/- 41.62, N = 3), built with -O2
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -fPIC -lpgcommon -lpgport -lpthread -lrt -lcrypt -ldl -lm

PostgreSQL pgbench

Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write

PostgreSQL pgbench 10.0. TPS, More Is Better.
- Stock: 11290.86 (SE +/- 264.90, N = 6), built with -O3 -march=native
- -mindirect-branch=thunk: 11147.89 (SE +/- 326.63, N = 6), built with -O3 -march=native
- -mindirect-branch=thunk-inline: 10540.60 (SE +/- 43.65, N = 3), built with -O3 -march=native
- spectre-tests: 3070.48 (SE +/- 43.25, N = 6), built with -O2
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -fPIC -lpgcommon -lpgport -lpthread -lrt -lcrypt -ldl -lm

Redis

Test: GET

Redis 3.0.1. Requests Per Second, More Is Better.
- Stock: 2222262.58 (SE +/- 43689.11, N = 3)
- -mindirect-branch=thunk: 2187123.33 (SE +/- 40180.95, N = 6)
- -mindirect-branch=thunk-inline: 2160702.13 (SE +/- 41655.36, N = 6)
- spectre-tests: 2297157.92 (SE +/- 8828.42, N = 3)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread

Redis

Test: SET

Redis 3.0.1. Requests Per Second, More Is Better.
- Stock: 1399046.62 (SE +/- 34299.85, N = 6)
- -mindirect-branch=thunk: 1280528.47 (SE +/- 160590.96, N = 6)
- -mindirect-branch=thunk-inline: 1457026.92 (SE +/- 2554.75, N = 3)
- spectre-tests: 1676110.71 (SE +/- 10561.86, N = 3)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread


Phoronix Test Suite v10.8.4