Intel Cascade Lake compiler optimization benchmarks on GCC 10.
Baseline
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: intel_pstate powersave - CPU Microcode: 0x500012c
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + tsx_async_abort: Mitigation of TSX disabled
PGO
Processor: Intel Core i9-10980XE @ 4.80GHz (18 Cores / 36 Threads), Motherboard: ASRock X299 Steel Legend (P1.30 BIOS), Chipset: Intel Sky Lake-E DMI3 Registers, Memory: 32GB, Disk: Samsung SSD 970 PRO 512GB, Graphics: NVIDIA NV132 11GB, Audio: Realtek ALC1220, Monitor: ASUS MG28U, Network: Intel I219-V + Intel I211
OS: Ubuntu 20.04, Kernel: 5.4.0-29-generic (x86_64), Desktop: GNOME Shell 3.36.1, Display Server: X Server 1.20.8, Display Driver: modesetting 1.20.8, OpenGL: 4.3 Mesa 20.0.4, Compiler: GCC 10.1.0, File-System: ext4, Screen Resolution: 3840x2160
AOM AV1 2.0 (Frames Per Second, More Is Better)
Encoder Mode: Speed 6 Realtime. PGO: 19.18 (SE +/- 0.02, N = 3); Baseline: 19.08 (SE +/- 0.01, N = 3)
Encoder Mode: Speed 6 Two-Pass. PGO: 3.70 (SE +/- 0.01, N = 3); Baseline: 3.69 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
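Each result on this page is reported as a mean with "SE +/- x, N = n". As a minimal sketch, assuming SE here is the usual standard error of the mean (sample standard deviation divided by the square root of the number of runs), the figure could be derived from per-run values as follows. The run values and the `standard_error` helper are hypothetical and are not taken from the results page.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate the standard error")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run frame rates for illustration only (NOT from the page).
runs = [19.0, 19.2, 19.1]

mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} FPS, SE +/- {se:.2f}, N = {len(runs)}")
```

A small SE relative to the gap between PGO and Baseline suggests the gap is repeatable; a gap smaller than the SE is hard to distinguish from run-to-run noise.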
Basis Universal
Basis Universal is a GPU texture codec. This test times how long it takes to convert sRGB PNGs into Basis Universal assets with various settings.
Basis Universal 1.12 (Seconds, Fewer Is Better)
Settings: ETC1S. PGO: 44.88 (SE +/- 0.07, N = 3); Baseline: 45.10 (SE +/- 0.02, N = 3)
Settings: UASTC Level 3. PGO: 37.22 (SE +/- 0.01, N = 3); Baseline: 37.31 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Crypto++ 8.2 (MiB/second, More Is Better)
Test: Unkeyed Algorithms. PGO: 359.42 (SE +/- 0.17, N = 3); Baseline: 357.55 (SE +/- 0.32, N = 3)
Test: Integer + Elliptic Curve Public Key Algorithms. PGO: 5691.24 (SE +/- 5.24, N = 3); Baseline: 5646.59 (SE +/- 22.82, N = 3)
(CXX) g++ options: -g2 -O3 -fPIC -pthread -pipe
dav1d
Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content.
dav1d 0.7.0 (FPS, More Is Better)
Video Input: Summer Nature 1080p. PGO: 554.23 (SE +/- 3.23, N = 3, MIN: 356.54 / MAX: 606.99); Baseline: 559.10 (SE +/- 0.59, N = 3, MIN: 375.87 / MAX: 606.19)
Video Input: Summer Nature 4K. PGO: 228.27 (SE +/- 0.49, N = 3, MIN: 177.26 / MAX: 251.18); Baseline: 227.36 (SE +/- 0.28, N = 3, MIN: 170.37 / MAX: 247.69)
Video Input: Chimera 1080p. PGO: 601.36 (SE +/- 6.52, N = 3, MIN: 404.59 / MAX: 750.79); Baseline: 611.80 (SE +/- 0.27, N = 3, MIN: 471.5 / MAX: 752.98)
Video Input: Chimera 1080p 10-bit. PGO: 97.43 (SE +/- 0.48, N = 3, MIN: 67.27 / MAX: 198.49); Baseline: 98.26 (SE +/- 0.27, N = 3, MIN: 67.46 / MAX: 205.81)
(CC) gcc options: -pthread
Facebook RocksDB
This is a benchmark of Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB.
Facebook RocksDB 6.3.6 (Op/s, More Is Better)
Test: Random Fill. PGO: 1405311 (SE +/- 10617.89, N = 3); Baseline: 1361002 (SE +/- 16050.51, N = 5)
Test: Random Read. PGO: 101999919 (SE +/- 129511.98, N = 3); Baseline: 102124258 (SE +/- 62784.49, N = 3)
(CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread
FFTW
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions.
FFTW 3.3.6 (Mflops, More Is Better)
Build: Float + SSE - Size: 2D FFT Size 4096. PGO: 19285 (SE +/- 87.35, N = 3); Baseline: 18968 (SE +/- 251.56, N = 3)
(CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
GraphicsMagick
This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image.
GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Operation: Rotate. PGO: 822 (SE +/- 9.26, N = 3); Baseline: 794 (SE +/- 12.50, N = 3)
(CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
LeelaChessZero 0.25 (Nodes Per Second, More Is Better)
Backend: Random. PGO: 142358 (SE +/- 310.95, N = 3); Baseline: 140468 (SE +/- 137.29, N = 3)
(CXX) g++ options: -pthread
LevelDB 1.22 (Microseconds Per Op, Fewer Is Better)
Benchmark: Sequential Fill. PGO: 337.84 (SE +/- 0.15, N = 3); Baseline: 337.73 (SE +/- 0.47, N = 3)
Benchmark: Random Read. PGO: 27.24 (SE +/- 0.36, N = 4); Baseline: 27.42 (SE +/- 0.31, N = 6)
Benchmark: Hot Read. PGO: 27.61 (SE +/- 0.08, N = 3); Baseline: 27.71 (SE +/- 0.16, N = 3)
Benchmark: Seek Random. PGO: 32.99 (SE +/- 0.14, N = 3); Baseline: 33.53 (SE +/- 0.18, N = 3)
(CXX) g++ options: -O3 -lsnappy -lpthread
MariaDB 10.5.2 (Queries Per Second, More Is Better)
Clients: 32. PGO: 494 (SE +/- 0.89, N = 3); Baseline: 483 (SE +/- 0.27, N = 3)
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -llzma -lbz2 -lsnappy -laio -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl
NGINX Benchmark
This is a test of ab, the Apache Benchmark program, running against nginx. This test profile measures how many requests per second a given system can sustain when carrying out 2,000,000 requests with 500 requests being carried out concurrently.
NGINX Benchmark 1.9.9 (Requests Per Second, More Is Better)
Static Web Page Serving. PGO: 51875.82 (SE +/- 210.12, N = 3); Baseline: 51257.90 (SE +/- 110.28, N = 3)
(CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native
PostgreSQL pgbench 12.0 (TPS, More Is Better)
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write. PGO: 9684.57 (SE +/- 101.31, N = 3); Baseline: 9701.43 (SE +/- 53.12, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
POV-Ray
This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing.
POV-Ray 3.7.0.7 (Seconds, Fewer Is Better)
Trace Time. PGO: 27.60 (SE +/- 0.11, N = 3); Baseline: 27.54 (SE +/- 0.05, N = 3)
(CXX) g++ options: -pipe -O3 -ffast-math -march=native -pthread -lSM -lICE -lX11 -lIlmImf -lImath -lHalf -lIex -lIexMath -lIlmThread -lpthread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
Stockfish
This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores.
Stockfish 9 (Nodes Per Second, More Is Better)
Total Time. PGO: 51588693 (SE +/- 511603.78, N = 3); Baseline: 51086797 (SE +/- 274511.80, N = 3)
(CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++11 -pedantic -O3 -msse -msse3 -mpopcnt -flto
Tungsten Renderer
Tungsten is a C++ physically based renderer that makes use of Intel's Embree ray tracing library.
Tungsten Renderer 0.2.2 (Seconds, Fewer Is Better)
Scene: Hair. PGO: 14.45 (SE +/- 0.02, N = 3); Baseline: 14.49 (SE +/- 0.00, N = 3)
Scene: Water Caustic. PGO: 21.04 (SE +/- 0.07, N = 3); Baseline: 21.11 (SE +/- 0.06, N = 3)
Scene: Non-Exponential. PGO: 6.05293 (SE +/- 0.05126, N = 3); Baseline: 6.15057 (SE +/- 0.07592, N = 3)
Scene: Volumetric Caustic. PGO: 7.31794 (SE +/- 0.02011, N = 3); Baseline: 7.33237 (SE +/- 0.06704, N = 3)
(CXX) g++ options: -std=c++0x -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mfma -mbmi2 -mavx512f -mavx512vl -mavx512cd -mavx512dq -mavx512bw -mno-sse4a -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512pf -mno-avx512er -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lIlmImf -lIlmThread -lImath -lHalf -lIex -lz -ljpeg -lpthread -ldl
YafaRay
YafaRay is an open-source, physically based Monte Carlo ray-tracing engine.
YafaRay 3.4.1 (Seconds, Fewer Is Better)
Total Time For Sample Scene. PGO: 103.52 (SE +/- 1.40, N = 4); Baseline: 111.72 (SE +/- 1.41, N = 5)
(CXX) g++ options: -std=c++11 -O3 -ffast-math -rdynamic -ldl -lImath -lIlmImf -lIex -lHalf -lz -lIlmThread -lxml2 -lfreetype -lpthread
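Most of the PGO and Baseline figures above sit within a percent or two of each other, so it helps to compare each gap against the reported standard errors. The sketch below computes the percentage gain of PGO over Baseline and the size of the gap in units of the combined standard error. It assumes the first value in each reported pair belongs to PGO and the second to Baseline. The `compare` helper and the threshold of two combined standard errors are illustrative assumptions and a rough rule of thumb, not a formal significance test used by the original page.

```python
import math


def compare(pgo, baseline, se_pgo, se_base, higher_is_better):
    """Return (percent gain of PGO over Baseline, |difference| / combined SE)."""
    if higher_is_better:
        gain = (pgo / baseline - 1.0) * 100.0
    else:
        # For timings, a smaller PGO value is the improvement.
        gain = (baseline / pgo - 1.0) * 100.0
    # Combine the two standard errors in quadrature.
    combined_se = math.hypot(se_pgo, se_base)
    # Guard against both SEs being reported as 0.00.
    ratio = abs(pgo - baseline) / combined_se if combined_se > 0 else math.inf
    return gain, ratio


# (name, PGO, Baseline, SE PGO, SE Baseline, higher_is_better)
# Values copied from the results listed above.
results = [
    ("YafaRay 3.4.1, Total Time (s)", 103.52, 111.72, 1.40, 1.41, False),
    ("MariaDB 10.5.2, 32 clients (QPS)", 494.0, 483.0, 0.89, 0.27, True),
    ("LevelDB 1.22, Sequential Fill (us/op)", 337.84, 337.73, 0.15, 0.47, False),
]

for name, pgo, base, se_p, se_b, hib in results:
    gain, ratio = compare(pgo, base, se_p, se_b, hib)
    verdict = "likely real" if ratio >= 2.0 else "within noise"
    print(f"{name}: PGO {gain:+.2f}% ({ratio:.1f}x combined SE, {verdict})")
```

Handling the "More Is Better" and "Fewer Is Better" cases separately keeps the sign convention consistent: a positive gain always means PGO came out ahead.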
Baseline: Testing initiated at 23 May 2020 07:13 by user pts.
PGO: Testing initiated at 23 May 2020 12:45 by user pts. Compiler, processor, Python, and security notes are identical to the Baseline notes.