Intel Cascade Lake compiler optimization benchmarks on GCC 10.
Baseline Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: intel_pstate powersave - CPU Microcode: 0x500012c
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + tsx_async_abort: Mitigation of TSX disabled
PGO Processor: Intel Core i9-10980XE @ 4.80GHz (18 Cores / 36 Threads), Motherboard: ASRock X299 Steel Legend (P1.30 BIOS), Chipset: Intel Sky Lake-E DMI3 Registers, Memory: 32GB, Disk: Samsung SSD 970 PRO 512GB, Graphics: NVIDIA NV132 11GB, Audio: Realtek ALC1220, Monitor: ASUS MG28U, Network: Intel I219-V + Intel I211
OS: Ubuntu 20.04, Kernel: 5.4.0-29-generic (x86_64), Desktop: GNOME Shell 3.36.1, Display Server: X Server 1.20.8, Display Driver: modesetting 1.20.8, OpenGL: 4.3 Mesa 20.0.4, Compiler: GCC 10.1.0, File-System: ext4, Screen Resolution: 3840x2160
Crypto++ 8.2 (MiB/second, more is better)
Test: Unkeyed Algorithms
  Baseline: 357.55 (SE +/- 0.32, N = 3)
  PGO: 359.42 (SE +/- 0.17, N = 3)
Test: Integer + Elliptic Curve Public Key Algorithms
  Baseline: 5646.59 (SE +/- 22.82, N = 3)
  PGO: 5691.24 (SE +/- 5.24, N = 3)
(CXX) g++ options: -g2 -O3 -fPIC -pthread -pipe
Basis Universal
Basis Universal is a GPU texture codec. This test times how long it takes to convert sRGB PNGs into Basis Universal assets with various settings.
Basis Universal 1.12 (Seconds, fewer is better)
Settings: ETC1S
  Baseline: 45.10 (SE +/- 0.02, N = 3)
  PGO: 44.88 (SE +/- 0.07, N = 3)
Settings: UASTC Level 3
  Baseline: 37.31 (SE +/- 0.01, N = 3)
  PGO: 37.22 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
FFTW
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions.
FFTW 3.3.6 (Mflops, more is better)
Build: Float + SSE - Size: 2D FFT Size 4096
  Baseline: 18968 (SE +/- 251.56, N = 3)
  PGO: 19285 (SE +/- 87.35, N = 3)
(CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
LeelaChessZero 0.25 (Nodes Per Second, more is better)
Backend: Random
  Baseline: 140468 (SE +/- 137.29, N = 3)
  PGO: 142358 (SE +/- 310.95, N = 3)
(CXX) g++ options: -pthread
Stockfish
This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores.
Stockfish 9 (Nodes Per Second, more is better)
Total Time
  Baseline: 51086797 (SE +/- 274511.80, N = 3)
  PGO: 51588693 (SE +/- 511603.78, N = 3)
(CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++11 -pedantic -O3 -msse -msse3 -mpopcnt -flto
AOM AV1 2.0 (Frames Per Second, more is better)
Encoder Mode: Speed 6 Realtime
  Baseline: 19.08 (SE +/- 0.01, N = 3)
  PGO: 19.18 (SE +/- 0.02, N = 3)
Encoder Mode: Speed 6 Two-Pass
  Baseline: 3.69 (SE +/- 0.00, N = 3)
  PGO: 3.70 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Tungsten Renderer
Tungsten is a C++ physically based renderer that makes use of Intel's Embree ray tracing library.
Tungsten Renderer 0.2.2 (Seconds, fewer is better)
Scene: Hair
  Baseline: 14.49 (SE +/- 0.00, N = 3)
  PGO: 14.45 (SE +/- 0.02, N = 3)
Scene: Water Caustic
  Baseline: 21.11 (SE +/- 0.06, N = 3)
  PGO: 21.04 (SE +/- 0.07, N = 3)
Scene: Non-Exponential
  Baseline: 6.15057 (SE +/- 0.07592, N = 3)
  PGO: 6.05293 (SE +/- 0.05126, N = 3)
Scene: Volumetric Caustic
  Baseline: 7.33237 (SE +/- 0.06704, N = 3)
  PGO: 7.31794 (SE +/- 0.02011, N = 3)
(CXX) g++ options (all four scenes): -std=c++0x -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mfma -mbmi2 -mavx512f -mavx512vl -mavx512cd -mavx512dq -mavx512bw -mno-sse4a -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512pf -mno-avx512er -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lIlmImf -lIlmThread -lImath -lHalf -lIex -lz -ljpeg -lpthread -ldl
GraphicsMagick
This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image.
GraphicsMagick 1.3.33 (Iterations Per Minute, more is better)
Operation: Rotate
  Baseline: 794 (SE +/- 12.50, N = 3)
  PGO: 822 (SE +/- 9.26, N = 3)
(CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
dav1d
Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content.
dav1d 0.7.0 (FPS, more is better)
Video Input: Summer Nature 1080p
  PGO: 554.23 (SE +/- 3.23, N = 3, MIN: 356.54 / MAX: 606.99)
  Baseline: 559.10 (SE +/- 0.59, N = 3, MIN: 375.87 / MAX: 606.19)
Video Input: Summer Nature 4K
  Baseline: 227.36 (SE +/- 0.28, N = 3, MIN: 170.37 / MAX: 247.69)
  PGO: 228.27 (SE +/- 0.49, N = 3, MIN: 177.26 / MAX: 251.18)
Video Input: Chimera 1080p
  PGO: 601.36 (SE +/- 6.52, N = 3, MIN: 404.59 / MAX: 750.79)
  Baseline: 611.80 (SE +/- 0.27, N = 3, MIN: 471.5 / MAX: 752.98)
Video Input: Chimera 1080p 10-bit
  PGO: 97.43 (SE +/- 0.48, N = 3, MIN: 67.27 / MAX: 198.49)
  Baseline: 98.26 (SE +/- 0.27, N = 3, MIN: 67.46 / MAX: 205.81)
(CC) gcc options: -pthread
POV-Ray
This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing.
POV-Ray 3.7.0.7 (Seconds, fewer is better)
Trace Time
  PGO: 27.60 (SE +/- 0.11, N = 3)
  Baseline: 27.54 (SE +/- 0.05, N = 3)
(CXX) g++ options: -pipe -O3 -ffast-math -march=native -pthread -lSM -lICE -lX11 -lIlmImf -lImath -lHalf -lIex -lIexMath -lIlmThread -lpthread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
YafaRay
YafaRay is an open-source, physically based Monte Carlo ray-tracing engine.
YafaRay 3.4.1 (Seconds, fewer is better)
Total Time For Sample Scene
  Baseline: 111.72 (SE +/- 1.41, N = 5)
  PGO: 103.52 (SE +/- 1.40, N = 4)
(CXX) g++ options: -std=c++11 -O3 -ffast-math -rdynamic -ldl -lImath -lIlmImf -lIex -lHalf -lz -lIlmThread -lxml2 -lfreetype -lpthread
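One way to gauge whether a gap like the YafaRay result above is larger than run-to-run noise is to compare the difference in means against the combined standard error. The sketch below is not part of the original test suite: the function name `compare_runs` is hypothetical, and with only 4 or 5 runs per configuration the ratio is a coarse indicator rather than a formal significance test.

```python
import math

def compare_runs(base_mean, base_se, pgo_mean, pgo_se, lower_is_better):
    """Return (percent_gain, diff_in_se_units) for PGO relative to Baseline.

    percent_gain is positive when PGO is better. The second value is the
    absolute difference of the means divided by the combined standard error.
    """
    diff = pgo_mean - base_mean
    if lower_is_better:
        diff = -diff  # a smaller time counts as an improvement
    percent_gain = 100.0 * diff / base_mean
    combined_se = math.sqrt(base_se ** 2 + pgo_se ** 2)
    return percent_gain, abs(diff) / combined_se

# YafaRay 3.4.1, seconds (fewer is better), values taken from the result above.
gain, se_units = compare_runs(111.72, 1.41, 103.52, 1.40, lower_is_better=True)
print(f"YafaRay: PGO {gain:+.2f}%, difference = {se_units:.1f} combined SE")
```

For the YafaRay figures this works out to a gain of roughly 7.3% at about four combined standard errors, whereas most other results on this page differ by an amount comparable to their reported SE.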
NGINX Benchmark
This is a test of ab, which is the Apache Benchmark program running against nginx. This test profile measures how many requests per second a given system can sustain when carrying out 2,000,000 requests with 500 requests being carried out concurrently.
NGINX Benchmark 1.9.9 (Requests Per Second, more is better)
Static Web Page Serving
  Baseline: 51257.90 (SE +/- 110.28, N = 3)
  PGO: 51875.82 (SE +/- 210.12, N = 3)
(CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native
Facebook RocksDB
This is a benchmark of Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB.
Facebook RocksDB 6.3.6 (Op/s, more is better)
Test: Random Fill
  Baseline: 1361002 (SE +/- 16050.51, N = 5)
  PGO: 1405311 (SE +/- 10617.89, N = 3)
Test: Random Read
  PGO: 101999919 (SE +/- 129511.98, N = 3)
  Baseline: 102124258 (SE +/- 62784.49, N = 3)
(CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread
LevelDB 1.22 (Microseconds Per Op, fewer is better)
Benchmark: Sequential Fill
  PGO: 337.84 (SE +/- 0.15, N = 3)
  Baseline: 337.73 (SE +/- 0.47, N = 3)
Benchmark: Random Read
  Baseline: 27.42 (SE +/- 0.31, N = 6)
  PGO: 27.24 (SE +/- 0.36, N = 4)
Benchmark: Hot Read
  Baseline: 27.71 (SE +/- 0.16, N = 3)
  PGO: 27.61 (SE +/- 0.08, N = 3)
Benchmark: Seek Random
  Baseline: 33.53 (SE +/- 0.18, N = 3)
  PGO: 32.99 (SE +/- 0.14, N = 3)
(CXX) g++ options: -O3 -lsnappy -lpthread
PostgreSQL pgbench 12.0 (TPS, more is better)
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
  PGO: 9684.57 (SE +/- 101.31, N = 3)
  Baseline: 9701.43 (SE +/- 53.12, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
MariaDB 10.5.2 (Queries Per Second, more is better)
Clients: 32
  Baseline: 483 (SE +/- 0.27, N = 3)
  PGO: 494 (SE +/- 0.89, N = 3)
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -llzma -lbz2 -lsnappy -laio -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl
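To condense results with mixed units into one figure, a common approach is the geometric mean of per-test ratios, inverting the ratio for fewer-is-better tests so that values above 1.0 always favor PGO. The sketch below applies this to a handful of the results listed above. The selection of tests and the names `RESULTS` and `geometric_mean_ratio` are illustrative assumptions; the original page does not report such a summary.

```python
import math

# (test, baseline, pgo, higher_is_better); values copied from the results above.
RESULTS = [
    ("Crypto++ Unkeyed Algorithms", 357.55, 359.42, True),
    ("Basis Universal ETC1S", 45.10, 44.88, False),
    ("YafaRay Total Time", 111.72, 103.52, False),
    ("dav1d Chimera 1080p", 611.80, 601.36, True),
    ("MariaDB 32 Clients", 483, 494, True),
]

def geometric_mean_ratio(results):
    """Geometric mean of PGO-vs-Baseline speedups; above 1.0 favors PGO."""
    log_sum = 0.0
    for _name, baseline, pgo, higher_is_better in results:
        # Normalize so that a ratio above 1.0 always means PGO did better.
        ratio = pgo / baseline if higher_is_better else baseline / pgo
        log_sum += math.log(ratio)
    return math.exp(log_sum / len(results))

summary = geometric_mean_ratio(RESULTS)
print(f"Geometric mean speedup (PGO vs. Baseline): {summary:.4f}")
```

The geometric mean is used instead of an arithmetic mean because it treats a 2x gain and a 2x loss as cancelling out, which keeps a single large-magnitude test from dominating the summary.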
Baseline: Testing initiated at 23 May 2020 07:13 by user pts.
PGO: Testing initiated at 23 May 2020 12:45 by user pts.