AMD EPYC 7F72 2P Linux 5.11
2 x AMD EPYC 7F72 24-Core testing, looking at CPU frequency invariance on Linux 5.11 with the patch applied. CPU power consumption was monitored via the AMD_Energy interface at 1-second polling.
HTML result view exported from: https://openbenchmarking.org/result/2101248-HA-AMDEPYC7F52&hlc=1&hnr=1&hlc=1&gru&rdt .
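The description mentions polling the AMD_Energy interface once per second. As a rough sketch of what such polling could look like (not the method the Phoronix Test Suite itself uses), the Python below assumes the kernel's `amd_energy` hwmon driver is loaded and exposes cumulative counters as `energy*_input` files in microjoules under `/sys/class/hwmon`. The function names are hypothetical, and counter wraparound is not handled.

```python
import glob
import os
import time


def find_energy_inputs(hwmon_root="/sys/class/hwmon"):
    """Return energy*_input files of hwmon devices named 'amd_energy' (assumed layout)."""
    found = []
    for name_file in sorted(glob.glob(os.path.join(hwmon_root, "hwmon*", "name"))):
        with open(name_file) as handle:
            if handle.read().strip() != "amd_energy":
                continue
        device_dir = os.path.dirname(name_file)
        found.extend(sorted(glob.glob(os.path.join(device_dir, "energy*_input"))))
    return found


def read_microjoules(path):
    """Read one cumulative energy counter (assumed to be in microjoules)."""
    with open(path) as handle:
        return int(handle.read().strip())


def average_power_watts(start_uj, end_uj, interval_s):
    """Convert an energy delta in microjoules over interval_s seconds to watts."""
    return (end_uj - start_uj) / 1_000_000 / interval_s


def sample_power(paths, interval_s=1.0):
    """Take two readings interval_s apart and return the average power per counter."""
    started = time.monotonic()
    before = [read_microjoules(p) for p in paths]
    time.sleep(interval_s)
    after = [read_microjoules(p) for p in paths]
    elapsed = time.monotonic() - started
    return {p: average_power_watts(b, a, elapsed) for p, b, a in zip(paths, before, after)}
```

Calling `sample_power(find_energy_inputs())` in a loop would give one reading per counter per second. If the driver exposes both per-core and per-socket counters, adding them all together would double count, so a total should use only one of the two groups.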
Linux 5.10:
  Processor: 2 x AMD EPYC 7F72 24-Core @ 3.20GHz (48 Cores / 96 Threads)
  Motherboard: Supermicro H11DSi-NT v2.00 (2.1 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 16 x 8192 MB DDR4-3200MT/s HMA81GR7CJR8N-XN
  Disk: 1000GB Western Digital WD_BLACK SN850 1TB
  Graphics: ASPEED
  Monitor: VE228
  Network: 2 x Intel 10G X550T
  OS: Ubuntu 20.10
  Kernel: 5.10.9-051009-generic (x86_64)
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: modesetting 1.20.9
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Linux 5.11 Git (changed fields only):
  Kernel: 5.11.0-051100rc4daily20210122-generic (x86_64) 20210121
Linux 5.11 Patched (changed fields only):
  Monitor: VE228
  Kernel: 5.11.0-rc4-max-boost-inv-patch (x86_64) 20210121

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x8301034
Java Details: OpenJDK Runtime Environment (build 11.0.9.1+1-Ubuntu-0ubuntu1.20.10)
Python Details: Python 3.8.6
Security Details:
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
Linux 5.10        Linux 5.11 Git    Linux 5.11 Patched  Test
981673            1084405           1103118             blogbench: Read
52.63             52.63             54.97               ospray: San Miguel - SciVis
21.63             22.09             22.49               plaidml: No - Inference - VGG19 - CPU
726.865           627.208           655.225             ttsiod-renderer: Phong Rendering With Soft-Shadow Mapping
2.837             2.902             3.054               rav1e: 10
1.346             1.370             1.408               rav1e: 6
1.018             1.045             1.068               rav1e: 5
0.351             0.368             0.372               rav1e: 1
0.094             0.092             0.091               svt-av1: Enc Mode 0 - 1080p
365.96            381.08            371.48              svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p
357.42            369.01            364.81              svt-vp9: VMAF Optimized - Bosphorus 1080p
48.48             47.66             49.45               x265: Bosphorus 1080p
19.26             18.63             19.74               x265: Bosphorus 4K
182               175               181                 onnx: yolov4 - OpenMP CPU
135602            132477            139037              cpuminer-opt: LBC, LBRY Credits
517.13            505.19            475.25              ior: 2MB - Default Test Directory
182254.44246531   174206.13000387   178738.12497094     ffte: N=256, 3D Complex FFT Routine
17440             18468             17015               fftw: Float + SSE - 2D FFT Size 4096
4379              4284              4433                lczero: Eigen
23.573            21.129            23.787              lammps: Rhodopsin Protein
280163.95         302893.56         294214.37           keydb:
4780667           4550333           4612308             john-the-ripper: MD5
1429370.71        1380890.22        1427348.10          redis: SET
1563597.66        1539146.21        1611164.34          redis: SADD
1593              1697              1720                ai-benchmark: Device Inference Score
1028              1059              1067                ai-benchmark: Device Training Score
2621              2756              2787                ai-benchmark: Device AI Score
152840.60         147443.86         154376.76           npb: LU.C
615930            638971            636521              brl-cad: VGR Performance Metric
18424.984         19576.122         19771.223           lulesh:
61294.2           65193.0           62195.4             tensorflow-lite: SqueezeNet
755450            765726            736285              tensorflow-lite: Inception ResNet V2
835208            894640            810750              tensorflow-lite: Inception V4
41287.102865      40124.373698      39406.757812        financebench: Repo OpenMP
291.428           303.449           289.764             tnn: CPU - MobileNet v2
0.872441          0.914198          0.863782            onednn: Convolution Batch Shapes Auto - f32 - CPU
2.32899           2.40549           2.33290             onednn: Deconvolution Batch shapes_1d - f32 - CPU
1.56970           1.62088           1.55447             onednn: IP Shapes 1D - f32 - CPU
0.888428          0.881348          0.849248            onednn: IP Shapes 3D - f32 - CPU
0.519015          0.547674          0.521968            onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU
4951              4897              4778                dacapobench: Jython
6032              5954              5591                dacapobench: Tradebeans
61.440            60.858            59.177              build-godot: Time To Compile
102.498           97.641            92.916              build-gdb: Time To Compile
1197.60           1217.49           1171.03             qe: AUSURF112
54.965            53.862            52.684              rodinia: OpenMP Leukocyte
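The summary figures mix "more is better" and "fewer is better" metrics, so comparing kernels means normalizing the direction first. The sketch below is one possible way to express a run's result as a percentage gain over the Linux 5.10 baseline. The helper name `percent_gain` is an illustration, not something the Phoronix Test Suite provides, and the sample values are copied from the summary table.

```python
def percent_gain(baseline, candidate, higher_is_better=True):
    """Return how much better `candidate` is than `baseline`, in percent.

    A positive result means the candidate is better; for "fewer is better"
    metrics (times, latencies) the ratio is inverted so the sign keeps
    that meaning.
    """
    if higher_is_better:
        return (candidate / baseline - 1.0) * 100.0
    return (baseline / candidate - 1.0) * 100.0


# Linux 5.10 versus Linux 5.11 Patched, values taken from the summary table.
samples = [
    ("blogbench: Read", 981673, 1103118, True),
    ("ior: 2MB - Default Test Directory", 517.13, 475.25, True),
    ("build-gdb: Time To Compile", 102.498, 92.916, False),
]

for name, linux_510, patched, higher_is_better in samples:
    gain = percent_gain(linux_510, patched, higher_is_better)
    print(f"{name}: {gain:+.2f}%")
```

With these inputs the BlogBench and GDB build results come out as gains for the patched kernel, while IOR comes out as a loss.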
BlogBench 1.1 - Test: Read
Final Score, More Is Better
  Linux 5.10: 981673 (SE +/- 4087.26, N = 3)
  Linux 5.11 Git: 1084405 (SE +/- 10984.18, N = 9)
  Linux 5.11 Patched: 1103118 (SE +/- 1738.41, N = 3)
  1. (CC) gcc options: -O2 -pthread

OSPray 1.8.5 - Demo: San Miguel - Renderer: SciVis
FPS, More Is Better
  Linux 5.10: 52.63 (SE +/- 0.00, N = 3; MIN: 24.39 / MAX: 58.82)
  Linux 5.11 Git: 52.63 (SE +/- 0.00, N = 3; MIN: 27.03 / MAX: 58.82)
  Linux 5.11 Patched: 54.97 (SE +/- 0.58, N = 5; MIN: 31.25 / MAX: 58.82)

PlaidML - FP16: No - Mode: Inference - Network: VGG19 - Device: CPU
FPS, More Is Better
  Linux 5.10: 21.63 (SE +/- 0.23, N = 15)
  Linux 5.11 Git: 22.09 (SE +/- 0.20, N = 15)
  Linux 5.11 Patched: 22.49 (SE +/- 0.16, N = 15)

TTSIOD 3D Renderer 2.3b - Phong Rendering With Soft-Shadow Mapping
FPS, More Is Better
  Linux 5.10: 726.87 (SE +/- 10.33, N = 3)
  Linux 5.11 Git: 627.21 (SE +/- 9.04, N = 15)
  Linux 5.11 Patched: 655.23 (SE +/- 3.22, N = 3)
  1. (CXX) g++ options: -O3 -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -fopenmp -fwhole-program -lstdc++

rav1e 0.4 - Speed: 10
Frames Per Second, More Is Better
  Linux 5.10: 2.837 (SE +/- 0.024, N = 3)
  Linux 5.11 Git: 2.902 (SE +/- 0.016, N = 3)
  Linux 5.11 Patched: 3.054 (SE +/- 0.008, N = 3)

rav1e 0.4 - Speed: 6
Frames Per Second, More Is Better
  Linux 5.10: 1.346 (SE +/- 0.005, N = 3)
  Linux 5.11 Git: 1.370 (SE +/- 0.002, N = 3)
  Linux 5.11 Patched: 1.408 (SE +/- 0.003, N = 3)

rav1e 0.4 - Speed: 5
Frames Per Second, More Is Better
  Linux 5.10: 1.018 (SE +/- 0.006, N = 3)
  Linux 5.11 Git: 1.045 (SE +/- 0.001, N = 3)
  Linux 5.11 Patched: 1.068 (SE +/- 0.001, N = 3)

rav1e 0.4 - Speed: 1
Frames Per Second, More Is Better
  Linux 5.10: 0.351 (SE +/- 0.003, N = 3)
  Linux 5.11 Git: 0.368 (SE +/- 0.002, N = 3)
  Linux 5.11 Patched: 0.372 (SE +/- 0.001, N = 3)

SVT-AV1 0.8 - Encoder Mode: Enc Mode 0 - Input: 1080p
Frames Per Second, More Is Better
  Linux 5.10: 0.094 (SE +/- 0.001, N = 3)
  Linux 5.11 Git: 0.092 (SE +/- 0.001, N = 3)
  Linux 5.11 Patched: 0.091 (SE +/- 0.001, N = 12)
  1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-VP9 0.1 - Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p
Frames Per Second, More Is Better
  Linux 5.10: 365.96 (SE +/- 2.11, N = 10)
  Linux 5.11 Git: 381.08 (SE +/- 2.00, N = 10)
  Linux 5.11 Patched: 371.48 (SE +/- 1.70, N = 9)
  1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9 0.1 - Tuning: VMAF Optimized - Input: Bosphorus 1080p
Frames Per Second, More Is Better
  Linux 5.10: 357.42 (SE +/- 1.89, N = 10)
  Linux 5.11 Git: 369.01 (SE +/- 1.11, N = 10)
  Linux 5.11 Patched: 364.81 (SE +/- 0.91, N = 10)
  1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

x265 3.4 - Video Input: Bosphorus 1080p
Frames Per Second, More Is Better
  Linux 5.10: 48.48 (SE +/- 0.26, N = 4)
  Linux 5.11 Git: 47.66 (SE +/- 0.42, N = 7)
  Linux 5.11 Patched: 49.45 (SE +/- 0.52, N = 4)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

x265 3.4 - Video Input: Bosphorus 4K
Frames Per Second, More Is Better
  Linux 5.10: 19.26 (SE +/- 0.13, N = 3)
  Linux 5.11 Git: 18.63 (SE +/- 0.10, N = 3)
  Linux 5.11 Patched: 19.74 (SE +/- 0.14, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

ONNX Runtime 1.6 - Model: yolov4 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
  Linux 5.10: 182 (SE +/- 1.89, N = 12)
  Linux 5.11 Git: 175 (SE +/- 1.60, N = 12)
  Linux 5.11 Patched: 181 (SE +/- 1.86, N = 3)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

Cpuminer-Opt 3.15.5 - Algorithm: LBC, LBRY Credits
kH/s, More Is Better
  Linux 5.10: 135602 (SE +/- 1088.59, N = 15)
  Linux 5.11 Git: 132477 (SE +/- 1036.73, N = 3)
  Linux 5.11 Patched: 139037 (SE +/- 1380.06, N = 3)
  1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp

IOR 3.3.0 - Block Size: 2MB - Disk Target: Default Test Directory
MB/s, More Is Better
  Linux 5.10: 517.13 (SE +/- 1.76, N = 3; MIN: 453.79 / MAX: 894.75)
  Linux 5.11 Git: 505.19 (SE +/- 1.77, N = 3; MIN: 457.62 / MAX: 951.11)
  Linux 5.11 Patched: 475.25 (SE +/- 2.06, N = 3; MIN: 400.96 / MAX: 971.55)
  1. (CC) gcc options: -O2 -lm -pthread -lmpi

FFTE 7.0 - N=256, 3D Complex FFT Routine
MFLOPS, More Is Better
  Linux 5.10: 182254.44 (SE +/- 1647.45, N = 15)
  Linux 5.11 Git: 174206.13 (SE +/- 1640.30, N = 15)
  Linux 5.11 Patched: 178738.12 (SE +/- 1760.31, N = 15)
  1. (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp

FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 4096
Mflops, More Is Better
  Linux 5.10: 17440 (SE +/- 280.66, N = 6)
  Linux 5.11 Git: 18468 (SE +/- 24.98, N = 3)
  Linux 5.11 Patched: 17015 (SE +/- 213.45, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

LeelaChessZero 0.26 - Backend: Eigen
Nodes Per Second, More Is Better
  Linux 5.10: 4379 (SE +/- 32.10, N = 3)
  Linux 5.11 Git: 4284 (SE +/- 49.20, N = 4)
  Linux 5.11 Patched: 4433 (SE +/- 36.23, N = 3)
  1. (CXX) g++ options: -flto -pthread

LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein
ns/day, More Is Better
  Linux 5.10: 23.57 (SE +/- 0.23, N = 15)
  Linux 5.11 Git: 21.13 (SE +/- 0.23, N = 15)
  Linux 5.11 Patched: 23.79 (SE +/- 0.17, N = 12)
  1. (CXX) g++ options: -O3 -pthread -lm

KeyDB 6.0.16
Ops/sec, More Is Better
  Linux 5.10: 280163.95 (SE +/- 3843.85, N = 3)
  Linux 5.11 Git: 302893.56 (SE +/- 4239.68, N = 15)
  Linux 5.11 Patched: 294214.37 (SE +/- 3012.50, N = 15)
  1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

John The Ripper 1.9.0-jumbo-1 - Test: MD5
Real C/S, More Is Better
  Linux 5.10: 4780667 (SE +/- 8171.77, N = 3)
  Linux 5.11 Git: 4550333 (SE +/- 49184.46, N = 3)
  Linux 5.11 Patched: 4612308 (SE +/- 54344.04, N = 13)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt -lbz2

Redis 6.0.9 - Test: SET
Requests Per Second, More Is Better
  Linux 5.10: 1429370.71 (SE +/- 13292.58, N = 7)
  Linux 5.11 Git: 1380890.22 (SE +/- 10410.66, N = 15)
  Linux 5.11 Patched: 1427348.10 (SE +/- 13176.39, N = 15)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

Redis 6.0.9 - Test: SADD
Requests Per Second, More Is Better
  Linux 5.10: 1563597.66 (SE +/- 16159.15, N = 3)
  Linux 5.11 Git: 1539146.21 (SE +/- 16361.41, N = 3)
  Linux 5.11 Patched: 1611164.34 (SE +/- 15585.71, N = 4)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

AI Benchmark Alpha 0.1.2 - Device Inference Score
Score, More Is Better
  Linux 5.10: 1593
  Linux 5.11 Git: 1697
  Linux 5.11 Patched: 1720

AI Benchmark Alpha 0.1.2 - Device Training Score
Score, More Is Better
  Linux 5.10: 1028
  Linux 5.11 Git: 1059
  Linux 5.11 Patched: 1067

AI Benchmark Alpha 0.1.2 - Device AI Score
Score, More Is Better
  Linux 5.10: 2621
  Linux 5.11 Git: 2756
  Linux 5.11 Patched: 2787

NAS Parallel Benchmarks 3.4 - Test / Class: LU.C
Total Mop/s, More Is Better
  Linux 5.10: 152840.60 (SE +/- 547.25, N = 4)
  Linux 5.11 Git: 147443.86 (SE +/- 1780.52, N = 15)
  Linux 5.11 Patched: 154376.76 (SE +/- 509.59, N = 4)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  2. Open MPI 4.0.3

BRL-CAD 7.30.8 - VGR Performance Metric
VGR Performance Metric, More Is Better
  Linux 5.10: 615930
  Linux 5.11 Git: 638971
  Linux 5.11 Patched: 636521
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm

LULESH 2.0.3
z/s, More Is Better
  Linux 5.10: 18424.98 (SE +/- 149.06, N = 5)
  Linux 5.11 Git: 19576.12 (SE +/- 67.78, N = 5)
  Linux 5.11 Patched: 19771.22 (SE +/- 171.84, N = 5)
  1. (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi

TensorFlow Lite 2020-08-23 - Model: SqueezeNet
Microseconds, Fewer Is Better
  Linux 5.10: 61294.2 (SE +/- 72.63, N = 3)
  Linux 5.11 Git: 65193.0 (SE +/- 690.93, N = 3)
  Linux 5.11 Patched: 62195.4 (SE +/- 412.91, N = 15)

TensorFlow Lite 2020-08-23 - Model: Inception ResNet V2
Microseconds, Fewer Is Better
  Linux 5.10: 755450 (SE +/- 1447.51, N = 3)
  Linux 5.11 Git: 765726 (SE +/- 4257.59, N = 3)
  Linux 5.11 Patched: 736285 (SE +/- 5824.36, N = 9)

TensorFlow Lite 2020-08-23 - Model: Inception V4
Microseconds, Fewer Is Better
  Linux 5.10: 835208 (SE +/- 4174.26, N = 3)
  Linux 5.11 Git: 894640 (SE +/- 2435.29, N = 3)
  Linux 5.11 Patched: 810750 (SE +/- 1163.43, N = 3)

FinanceBench 2016-07-25 - Benchmark: Repo OpenMP
ms, Fewer Is Better
  Linux 5.10: 41287.10 (SE +/- 215.27, N = 3)
  Linux 5.11 Git: 40124.37 (SE +/- 319.03, N = 3)
  Linux 5.11 Patched: 39406.76 (SE +/- 393.10, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp

TNN 0.2.3 - Target: CPU - Model: MobileNet v2
ms, Fewer Is Better
  Linux 5.10: 291.43 (SE +/- 0.66, N = 3; MIN: 283.33 / MAX: 459.62)
  Linux 5.11 Git: 303.45 (SE +/- 3.80, N = 3; MIN: 284.51 / MAX: 461.21)
  Linux 5.11 Patched: 289.76 (SE +/- 2.83, N = 3; MIN: 283.65 / MAX: 458.79)
  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl

oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Linux 5.10: 0.872441 (SE +/- 0.003739, N = 7; MIN: 0.79)
  Linux 5.11 Git: 0.914198 (SE +/- 0.006064, N = 7; MIN: 0.78)
  Linux 5.11 Patched: 0.863782 (SE +/- 0.001510, N = 7; MIN: 0.79)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Linux 5.10: 2.32899 (SE +/- 0.02538, N = 3; MIN: 1.93)
  Linux 5.11 Git: 2.40549 (SE +/- 0.03372, N = 3; MIN: 1.92)
  Linux 5.11 Patched: 2.33290 (SE +/- 0.01587, N = 3; MIN: 2)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Linux 5.10: 1.56970 (SE +/- 0.00943, N = 4; MIN: 1.31)
  Linux 5.11 Git: 1.62088 (SE +/- 0.01518, N = 4; MIN: 1.31)
  Linux 5.11 Patched: 1.55447 (SE +/- 0.01340, N = 4; MIN: 1.29)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Linux 5.10: 0.888428 (SE +/- 0.002999, N = 5; MIN: 0.77)
  Linux 5.11 Git: 0.881348 (SE +/- 0.005127, N = 5; MIN: 0.71)
  Linux 5.11 Patched: 0.849248 (SE +/- 0.004000, N = 5; MIN: 0.73)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Linux 5.10: 0.519015 (SE +/- 0.004888, N = 4; MIN: 0.43)
  Linux 5.11 Git: 0.547674 (SE +/- 0.005010, N = 4; MIN: 0.43)
  Linux 5.11 Patched: 0.521968 (SE +/- 0.004601, N = 4; MIN: 0.43)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

DaCapo Benchmark 9.12-MR1 - Java Test: Jython
msec, Fewer Is Better
  Linux 5.10: 4951 (SE +/- 47.90, N = 6)
  Linux 5.11 Git: 4897 (SE +/- 28.66, N = 18)
  Linux 5.11 Patched: 4778 (SE +/- 43.93, N = 6)

DaCapo Benchmark 9.12-MR1 - Java Test: Tradebeans
msec, Fewer Is Better
  Linux 5.10: 6032 (SE +/- 65.72, N = 4)
  Linux 5.11 Git: 5954 (SE +/- 50.83, N = 20)
  Linux 5.11 Patched: 5591 (SE +/- 66.39, N = 20)

Timed Godot Game Engine Compilation 3.2.3 - Time To Compile
Seconds, Fewer Is Better
  Linux 5.10: 61.44 (SE +/- 0.28, N = 3)
  Linux 5.11 Git: 60.86 (SE +/- 0.10, N = 3)
  Linux 5.11 Patched: 59.18 (SE +/- 0.17, N = 3)

Timed GDB GNU Debugger Compilation 9.1 - Time To Compile
Seconds, Fewer Is Better
  Linux 5.10: 102.50 (SE +/- 0.64, N = 3)
  Linux 5.11 Git: 97.64 (SE +/- 0.40, N = 3)
  Linux 5.11 Patched: 92.92 (SE +/- 0.43, N = 3)

Quantum ESPRESSO 6.7 - Input: AUSURF112
Seconds, Fewer Is Better
  Linux 5.10: 1197.60 (SE +/- 17.89, N = 9)
  Linux 5.11 Git: 1217.49 (SE +/- 11.28, N = 3)
  Linux 5.11 Patched: 1171.03 (SE +/- 12.21, N = 4)
  1. (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz

Rodinia 3.1 - Test: OpenMP Leukocyte
Seconds, Fewer Is Better
  Linux 5.10: 54.97 (SE +/- 0.36, N = 15)
  Linux 5.11 Git: 53.86 (SE +/- 0.35, N = 3)
  Linux 5.11 Patched: 52.68 (SE +/- 0.69, N = 3)
  1. (CXX) g++ options: -O2 -lOpenCL
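Most results above carry a standard error (SE +/-), which gives a rough sense of whether a gap between two kernels is larger than run-to-run noise. The sketch below is an informal heuristic rather than a formal significance test. It divides the gap between two means by their combined standard error, and the function name is hypothetical. The sample numbers are the Linux 5.10 and Linux 5.11 Patched results for BlogBench and SVT-AV1.

```python
import math


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Return |mean_b - mean_a| in units of the combined standard error.

    The combined standard error is sqrt(se_a**2 + se_b**2). When both SEs
    are reported as zero the ratio is undefined, so this returns 0.0 for
    identical means and infinity otherwise.
    """
    combined = math.hypot(se_a, se_b)
    if combined == 0.0:
        return 0.0 if mean_a == mean_b else math.inf
    return abs(mean_b - mean_a) / combined


# BlogBench Read: the gap is many combined standard errors wide.
print(gap_in_standard_errors(981673, 4087.26, 1103118, 1738.41))

# SVT-AV1 Enc Mode 0: the gap is only about two combined standard errors.
print(gap_in_standard_errors(0.094, 0.001, 0.091, 0.001))
```

Runs here use different sample counts (N ranges from 3 to 20), and the reported SE values are rounded, so small ratios are best read as "inconclusive" rather than "no change".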
Phoronix Test Suite v10.8.4