AMD EPYC 72F3 8-Core testing with a Supermicro H12SSL-i v1.01 (2.0 BIOS) and ASPEED on Ubuntu 21.04 via the Phoronix Test Suite.
AOCC 3.2
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver3
Processor Notes: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.9.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
AMD AOCC 3.2 Processor: AMD EPYC 72F3 8-Core @ 3.70GHz (8 Cores / 16 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 1000GB Corsair Force MP600, Graphics: ASPEED, Monitor: VE228, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 21.04, Kernel: 5.14.0-rc7-amd-pstate-phx (x86_64) 20210909, Desktop: GNOME Shell 3.38.4, Display Server: X Server, Compiler: Clang 13.0.0, File-System: ext4, Screen Resolution: 1920x1080
dav1d Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.
dav1d 0.9.2 (FPS, More Is Better)
Video Input: Summer Nature 4K. AMD AOCC 3.2: 200.33 (SE +/- 0.14, N = 3; MIN: 188.52 / MAX: 227.27); AOCC 3.2: 199.98 (SE +/- 0.23, N = 3; MIN: 188.58 / MAX: 226.52)
Video Input: Chimera 1080p 10-bit. AMD AOCC 3.2: 454.48 (SE +/- 1.84, N = 3; MIN: 362.96 / MAX: 709.88); AOCC 3.2: 452.42 (SE +/- 1.57, N = 3; MIN: 362.02 / MAX: 691.34)
Video Input: Chimera 1080p. AMD AOCC 3.2: 538.84 (SE +/- 0.97, N = 3; MIN: 431.65 / MAX: 824.69)
Video Input: Summer Nature 1080p. AMD AOCC 3.2: 504.53 (SE +/- 0.93, N = 3; MIN: 458.56 / MAX: 543.03)
(CC) gcc options: -O3 -march=native -pthread -lm
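The SE figures help judge whether two runs really differ. The following is a minimal sketch, not the Phoronix Test Suite's own method: it takes the two dav1d Summer Nature 4K means and standard errors, assuming the values are listed in the same order as the run labels, and compares the gap between runs with the combined standard error of two independent means. The function names are hypothetical.

```python
import math

def relative_difference(a, b):
    """Percent difference of a relative to b."""
    return (a - b) / b * 100.0

def combined_standard_error(se_a, se_b):
    """Standard error of the difference of two independent means."""
    return math.sqrt(se_a ** 2 + se_b ** 2)

# dav1d 0.9.2, Summer Nature 4K, FPS (mean, SE); assumed label order.
amd_aocc_fps, amd_aocc_se = 200.33, 0.14
aocc_fps, aocc_se = 199.98, 0.23

gap = amd_aocc_fps - aocc_fps
se_gap = combined_standard_error(amd_aocc_se, aocc_se)

print(f"difference: {gap:.2f} FPS "
      f"({relative_difference(amd_aocc_fps, aocc_fps):.2f}%)")
print(f"combined SE: {se_gap:.2f} FPS, gap / SE: {gap / se_gap:.2f}")
```

With only N = 3 runs per configuration, a gap of about one combined standard error is weak evidence of a real difference between the two runs.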
Kvazaar This is a test of Kvazaar as a CPU-based H.265/HEVC video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.
Kvazaar 2.1 (Frames Per Second, More Is Better)
Video Input: Bosphorus 4K - Video Preset: Medium. AMD AOCC 3.2: 8.37 (SE +/- 0.01, N = 3); AOCC 3.2: 8.36 (SE +/- 0.01, N = 3)
Video Input: Bosphorus 4K - Video Preset: Very Fast. AMD AOCC 3.2: 19.17 (SE +/- 0.04, N = 3); AOCC 3.2: 19.17 (SE +/- 0.03, N = 3)
Video Input: Bosphorus 4K - Video Preset: Ultra Fast. AMD AOCC 3.2: 32.26 (SE +/- 0.10, N = 3); AOCC 3.2: 32.17 (SE +/- 0.03, N = 3)
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt
SVT-AV1 This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-AV1 0.8.7 (Frames Per Second, More Is Better)
Encoder Mode: Preset 4 - Input: Bosphorus 4K. AMD AOCC 3.2: 1.610 (SE +/- 0.003, N = 3); AOCC 3.2: 1.608 (SE +/- 0.001, N = 3)
Encoder Mode: Preset 8 - Input: Bosphorus 4K. AOCC 3.2: 15.72 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
SVT-HEVC This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
Tuning: 1 - Input: Bosphorus 1080p. AMD AOCC 3.2: 9.17 (SE +/- 0.01, N = 3); AOCC 3.2: 9.21 (SE +/- 0.01, N = 3)
Tuning: 7 - Input: Bosphorus 1080p. AMD AOCC 3.2: 117.07 (SE +/- 0.20, N = 3); AOCC 3.2: 117.45 (SE +/- 0.12, N = 3)
Tuning: 10 - Input: Bosphorus 1080p. AMD AOCC 3.2: 231.07 (SE +/- 0.38, N = 3); AOCC 3.2: 229.83 (SE +/- 0.63, N = 3)
(CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
SVT-VP9 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample YUV input video file. Learn more via the OpenBenchmarking.org test page.
SVT-VP9 0.3 (Frames Per Second, More Is Better)
Tuning: VMAF Optimized - Input: Bosphorus 1080p. AMD AOCC 3.2: 151.30 (SE +/- 0.23, N = 3); AOCC 3.2: 151.74 (SE +/- 0.51, N = 3)
Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p. AMD AOCC 3.2: 157.99 (SE +/- 0.15, N = 3); AOCC 3.2: 158.43 (SE +/- 1.08, N = 3)
Tuning: Visual Quality Optimized - Input: Bosphorus 1080p. AOCC 3.2: 123.71 (SE +/- 0.60, N = 3)
(CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
x265 This is a simple test of the x265 encoder run on the CPU with 1080p and 4K options for H.265 video encode performance with x265. Learn more via the OpenBenchmarking.org test page.
x265 3.4 (Frames Per Second, More Is Better)
Video Input: Bosphorus 4K. AMD AOCC 3.2: 11.82 (SE +/- 0.08, N = 3); AOCC 3.2: 11.63 (SE +/- 0.07, N = 3)
Video Input: Bosphorus 1080p. AMD AOCC 3.2: 57.20 (SE +/- 0.18, N = 3); AOCC 3.2: 57.47 (SE +/- 0.49, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl -lnuma
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime 1.10 (Inferences Per Minute, More Is Better)
Model: yolov4 - Device: CPU. AMD AOCC 3.2: 296 (SE +/- 0.58, N = 3)
Model: fcn-resnet101-11 - Device: CPU. AMD AOCC 3.2: 54 (SE +/- 2.57, N = 12)
Model: shufflenet-v2-10 - Device: CPU. AMD AOCC 3.2: 21250 (SE +/- 460.99, N = 12)
Model: super-resolution-10 - Device: CPU. AMD AOCC 3.2: 3100 (SE +/- 2.46, N = 3)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=thin -ldl -pthread -lrt -lpthread
Chia Blockchain VDF Chia is a blockchain and smart transaction platform based on proofs of space and time rather than the proofs of work used by other cryptocurrencies. This test profile benchmarks CPU performance using the Chia VDF benchmark. VDF stands for Verifiable Delay Function, which Chia uses for its Proof of Time. Learn more via the OpenBenchmarking.org test page.
Chia Blockchain VDF 1.0.1 (IPS, More Is Better)
Test: Square Plain C++. AMD AOCC 3.2: 186667 (SE +/- 961.48, N = 3); AOCC 3.2: 187767 (SE +/- 560.75, N = 3)
Test: Square Assembly Optimized. AMD AOCC 3.2: 127313 (SE +/- 1751.38, N = 15); AOCC 3.2: 128133 (SE +/- 635.96, N = 3)
(CXX) g++ options: -flto -fno-PIE -lgmpxx -lgmp -lboost_system -pthread
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Operation: Swirl. AMD AOCC 3.2: 576; AOCC 3.2: 576
Operation: Rotate. AMD AOCC 3.2: 799; AOCC 3.2: 785 (a single SE +/- 0.33, N = 3 is reported; the source does not indicate which run it belongs to)
Operation: Sharpen. AMD AOCC 3.2: 123; AOCC 3.2: 123
Operation: Enhanced. AMD AOCC 3.2: 192; AOCC 3.2: 192
Operation: Resizing. AMD AOCC 3.2: 963 (SE +/- 6.17, N = 3); AOCC 3.2: 958 (SE +/- 1.15, N = 3)
Operation: Noise-Gaussian. AMD AOCC 3.2: 240; AOCC 3.2: 238
Operation: HWB Color Space. AMD AOCC 3.2: 637; AOCC 3.2: 620 (a single SE +/- 0.33, N = 3 is reported; the source does not indicate which run it belongs to)
(CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
SecureMark SecureMark is an objective, standardized benchmarking framework for measuring the efficiency of cryptographic processing solutions developed by EEMBC. SecureMark-TLS is benchmarking Transport Layer Security performance with a focus on IoT/edge computing. Learn more via the OpenBenchmarking.org test page.
SecureMark 1.0.4 (marks, More Is Better)
Benchmark: SecureMark-TLS. AMD AOCC 3.2: 302196 (SE +/- 775.46, N = 3); AOCC 3.2: 300122 (SE +/- 1160.02, N = 3)
(CC) gcc options: -pedantic -O3
LZ4 Compression 1.9.3 (MB/s, More Is Better)
Compression Level: 1 - Decompression Speed. AMD AOCC 3.2: 15163.7 (SE +/- 60.63, N = 3); AOCC 3.2: 15416.8 (SE +/- 59.35, N = 3)
Compression Level: 3 - Decompression Speed. AMD AOCC 3.2: 14235.6 (SE +/- 10.69, N = 5); AOCC 3.2: 14443.4 (SE +/- 5.07, N = 3)
Compression Level: 9 - Decompression Speed. AMD AOCC 3.2: 14302.3 (SE +/- 25.03, N = 3); AOCC 3.2: 14477.7 (SE +/- 5.45, N = 3)
(CC) gcc options: -O3
Zstd Compression This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
Zstd Compression 1.5.0 (MB/s, More Is Better)
Compression Level: 3 - Compression Speed. AMD AOCC 3.2: 3251.9 (SE +/- 4.13, N = 3); AOCC 3.2: 3193.7 (SE +/- 5.75, N = 3)
Compression Level: 3 - Decompression Speed. AMD AOCC 3.2: 4009.5 (SE +/- 4.62, N = 3); AOCC 3.2: 4081.1 (SE +/- 41.70, N = 2)
Compression Level: 8 - Compression Speed. AMD AOCC 3.2: 1396.2 (SE +/- 5.45, N = 3); AOCC 3.2: 1389.5 (SE +/- 9.90, N = 3)
Compression Level: 8 - Decompression Speed. AMD AOCC 3.2: 4138.3 (SE +/- 9.09, N = 3); AOCC 3.2: 4181.2 (SE +/- 13.63, N = 3)
Compression Level: 19 - Compression Speed. AMD AOCC 3.2: 51.7 (SE +/- 0.17, N = 3); AOCC 3.2: 51.6 (SE +/- 0.10, N = 3)
Compression Level: 19 - Decompression Speed. AMD AOCC 3.2: 3717.6 (SE +/- 38.85, N = 3); AOCC 3.2: 3648.8 (SE +/- 36.61, N = 3)
Compression Level: 3, Long Mode - Compression Speed. AMD AOCC 3.2: 443.2 (SE +/- 1.57, N = 3); AOCC 3.2: 438.2 (SE +/- 1.12, N = 3)
Compression Level: 3, Long Mode - Decompression Speed. AMD AOCC 3.2: 4392.4 (SE +/- 35.14, N = 3); AOCC 3.2: 4432.3 (SE +/- 30.47, N = 3)
Compression Level: 8, Long Mode - Compression Speed. AMD AOCC 3.2: 814.0 (SE +/- 1.12, N = 3); AOCC 3.2: 811.4 (SE +/- 7.35, N = 3)
Compression Level: 8, Long Mode - Decompression Speed. AMD AOCC 3.2: 4495.3 (SE +/- 56.68, N = 3); AOCC 3.2: 4551.4 (SE +/- 33.52, N = 3)
Compression Level: 19, Long Mode - Compression Speed. AMD AOCC 3.2: 42.2 (SE +/- 0.44, N = 3); AOCC 3.2: 41.8 (SE +/- 0.09, N = 3)
Compression Level: 19, Long Mode - Decompression Speed. AMD AOCC 3.2: 3731.5 (SE +/- 15.81, N = 3); AOCC 3.2: 3724.2 (SE +/- 33.83, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma
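The Zstd figures are throughputs: bytes processed divided by elapsed time, at several compression levels. The sketch below illustrates that kind of measurement only. It is not the test profile's code, and it swaps in Python's standard-library zlib for Zstd so that it runs without extra packages. The input buffer, the level choices, the helper names, and the use of 1 MB = 10^6 bytes are all assumptions.

```python
import os
import time
import zlib

def throughput_mb_s(num_bytes, seconds):
    """Throughput in MB/s, assuming 1 MB = 10**6 bytes."""
    return num_bytes / 1e6 / max(seconds, 1e-9)  # guard against a zero timer delta

def measure(data, level):
    """Time one compress pass and one decompress pass at the given level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    compress_s = time.perf_counter() - start

    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    decompress_s = time.perf_counter() - start

    assert restored == data  # the round trip must be lossless
    # Both speeds are reported relative to the uncompressed size.
    return (throughput_mb_s(len(data), compress_s),
            throughput_mb_s(len(data), decompress_s))

# Hypothetical stand-in input: repetitive text plus some random bytes.
sample = (b"disk image stand-in " * 8192) + os.urandom(65536)

for level in (1, 6, 9):  # zlib levels, not Zstd levels
    comp, decomp = measure(sample, level)
    print(f"level {level}: compress {comp:.1f} MB/s, decompress {decomp:.1f} MB/s")
```

A single timed pass like this is noisy; repeating each measurement and reporting the mean and standard error, as the results above do, gives a steadier figure.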
QuantLib QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and this test reports the QuantLib Benchmark Index score from its built-in benchmark. Learn more via the OpenBenchmarking.org test page.
QuantLib 1.21 (MFLOPS, More Is Better)
AMD AOCC 3.2: 3208.6 (SE +/- 4.72, N = 3); AOCC 3.2: 3212.8 (SE +/- 3.76, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
Botan 2.17.3 (MiB/s, More Is Better)
Test: KASUMI. AMD AOCC 3.2: 98.00 (SE +/- 0.02, N = 3); AOCC 3.2: 98.05 (SE +/- 0.01, N = 3)
Test: KASUMI - Decrypt. AMD AOCC 3.2: 97.14 (SE +/- 0.04, N = 3); AOCC 3.2: 97.17 (SE +/- 0.01, N = 3)
Test: AES-256. AMD AOCC 3.2: 5356.22 (SE +/- 10.99, N = 3); AOCC 3.2: 5358.65 (SE +/- 11.83, N = 3)
Test: AES-256 - Decrypt. AMD AOCC 3.2: 5359.63 (SE +/- 11.79, N = 3); AOCC 3.2: 5369.01 (SE +/- 24.03, N = 3)
Test: Twofish. AMD AOCC 3.2: 368.11 (SE +/- 0.13, N = 3); AOCC 3.2: 366.83 (SE +/- 0.27, N = 3)
Test: Twofish - Decrypt. AMD AOCC 3.2: 397.06 (SE +/- 0.14, N = 3); AOCC 3.2: 396.72 (SE +/- 0.29, N = 3)
Test: Blowfish. AMD AOCC 3.2: 415.01 (SE +/- 0.01, N = 3); AOCC 3.2: 414.60 (SE +/- 0.12, N = 3)
Test: Blowfish - Decrypt. AMD AOCC 3.2: 430.15 (SE +/- 0.05, N = 3); AOCC 3.2: 429.99 (SE +/- 0.01, N = 3)
Test: CAST-256. AMD AOCC 3.2: 157.58 (SE +/- 0.14, N = 3); AOCC 3.2: 157.75 (SE +/- 0.04, N = 3)
Test: CAST-256 - Decrypt. AMD AOCC 3.2: 154.61 (SE +/- 0.02, N = 3); AOCC 3.2: 154.60 (SE +/- 0.01, N = 3)
Test: ChaCha20Poly1305. AMD AOCC 3.2: 906.28 (SE +/- 12.67, N = 3); AOCC 3.2: 902.82 (SE +/- 12.18, N = 3)
Test: ChaCha20Poly1305 - Decrypt. AMD AOCC 3.2: 896.22 (SE +/- 6.86, N = 3); AOCC 3.2: 885.90 (SE +/- 8.98, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
JPEG XL libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance using the reference libjxl library. Learn more via the OpenBenchmarking.org test page.
JPEG XL libjxl 0.6.1 (MP/s, More Is Better)
Input: PNG - Encode Speed: 7. AMD AOCC 3.2: 9.07 (SE +/- 0.01, N = 3); AOCC 3.2: 9.15 (SE +/- 0.00, N = 3)
Input: PNG - Encode Speed: 8. AMD AOCC 3.2: 0.92 (SE +/- 0.00, N = 3); AOCC 3.2: 0.91 (SE +/- 0.00, N = 3)
Input: JPEG - Encode Speed: 7. AMD AOCC 3.2: 76.07 (SE +/- 0.08, N = 3); AOCC 3.2: 77.94 (SE +/- 0.41, N = 3)
Input: JPEG - Encode Speed: 8. AMD AOCC 3.2: 28.69 (SE +/- 0.07, N = 3); AOCC 3.2: 28.93 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie
JPEG XL Decoding libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is suited for JPEG XL decode performance testing to a PNG output file; the pts/jpegxl test is for encode performance. The JPEG XL encoding/decoding is done using the libjxl codebase. Learn more via the OpenBenchmarking.org test page.
JPEG XL Decoding libjxl 0.6.1 (MP/s, More Is Better)
CPU Threads: 1. AMD AOCC 3.2: 66.73 (SE +/- 0.13, N = 3); AOCC 3.2: 68.21 (SE +/- 0.07, N = 3)
Etcpak Etcpak is the self-proclaimed "fastest ETC compressor on the planet", focused on providing open-source, very fast ETC and S3 texture compression support. Learn more via the OpenBenchmarking.org test page.
Etcpak 0.7 (Mpx/s, More Is Better)
Configuration: DXT1. AMD AOCC 3.2: 3032.89 (SE +/- 6.89, N = 3); AOCC 3.2: 3042.99 (SE +/- 2.62, N = 3)
Configuration: ETC1. AMD AOCC 3.2: 245.96 (SE +/- 0.04, N = 3); AOCC 3.2: 245.74 (SE +/- 0.05, N = 3)
Configuration: ETC2. AMD AOCC 3.2: 231.49 (SE +/- 0.02, N = 3); AOCC 3.2: 230.84 (SE +/- 0.64, N = 3)
Configuration: ETC1 + Dithering. AMD AOCC 3.2: 296.99 (SE +/- 0.04, N = 3); AOCC 3.2: 297.12 (SE +/- 0.15, N = 3)
(CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
LeelaChessZero LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.
LeelaChessZero 0.28 (Nodes Per Second, More Is Better)
Backend: BLAS. AMD AOCC 3.2: 1980 (SE +/- 24.17, N = 3); AOCC 3.2: 1946 (SE +/- 14.15, N = 3)
Backend: Eigen. AMD AOCC 3.2: 2085 (SE +/- 14.26, N = 3); AOCC 3.2: 2122 (SE +/- 23.00, N = 3)
(CXX) g++ options: -flto -O3 -march=native -pthread
Stockfish This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.
Stockfish 13 (Nodes Per Second, More Is Better)
Total Time. AMD AOCC 3.2: 25833642 (SE +/- 117326.99, N = 3); AOCC 3.2: 25886541 (SE +/- 115127.05, N = 3)
(CXX) g++ options: -fprofile-use -m64 -lpthread -O3 -march=native -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -flto
nginx This is a benchmark of the lightweight Nginx HTTP(S) web server. This Nginx web server benchmark test profile makes use of the Golang "Bombardier" program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.
nginx 1.21.1 (Requests Per Second, More Is Better)
Concurrent Requests: 1. AMD AOCC 3.2: 52167.21 (SE +/- 275.54, N = 3)
Concurrent Requests: 20. AMD AOCC 3.2: 202421.97 (SE +/- 181.54, N = 3)
Concurrent Requests: 200. AMD AOCC 3.2: 201097.54 (SE +/- 579.54, N = 3)
(CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native
Apache HTTP Server This is a test of the Apache HTTPD web server. This Apache HTTPD web server benchmark test profile makes use of the Golang "Bombardier" program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.
Apache HTTP Server 2.4.48 (Requests Per Second, More Is Better)
Concurrent Requests: 1. AMD AOCC 3.2: 6329.40 (SE +/- 59.67, N = 7)
(CC) gcc options: -shared -fPIC -pthread -O3 -march=native
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31 (samples/s, More Is Better)
Threads: 1 - Buffer Length: 256 - Filter Length: 57. AMD AOCC 3.2: 66306667 (SE +/- 66838.94, N = 3); AOCC 3.2: 66342333 (SE +/- 37860.86, N = 3)
Threads: 2 - Buffer Length: 256 - Filter Length: 57. AMD AOCC 3.2: 132660000 (SE +/- 155241.75, N = 3); AOCC 3.2: 132690000 (SE +/- 132035.35, N = 3)
Threads: 4 - Buffer Length: 256 - Filter Length: 57. AMD AOCC 3.2: 264220000 (SE +/- 1730086.70, N = 3); AOCC 3.2: 263413333 (SE +/- 1809570.24, N = 3)
Threads: 8 - Buffer Length: 256 - Filter Length: 57. AMD AOCC 3.2: 518416667 (SE +/- 652337.68, N = 3); AOCC 3.2: 520683333 (SE +/- 1036505.88, N = 3)
Threads: 16 - Buffer Length: 256 - Filter Length: 57. AMD AOCC 3.2: 695340000 (SE +/- 162583.31, N = 3); AOCC 3.2: 695740000 (SE +/- 75498.34, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
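Because the Liquid-DSP test is repeated at 1, 2, 4, 8 and 16 threads, the results can be read as a thread-scaling curve. The sketch below is an illustration only: it takes the first value listed for each thread count (assumed to be the "AMD AOCC 3.2" run) and derives the speedup over one thread and the parallel efficiency (speedup divided by thread count). The `scaling` helper is hypothetical.

```python
# Liquid-DSP samples/s by thread count; first value listed per result
# (assumed to belong to the "AMD AOCC 3.2" run).
samples_per_second = {
    1: 66306667,
    2: 132660000,
    4: 264220000,
    8: 518416667,
    16: 695340000,
}

def scaling(results):
    """Map thread count -> (speedup over 1 thread, parallel efficiency)."""
    base = results[1]
    return {
        threads: (rate / base, rate / base / threads)
        for threads, rate in sorted(results.items())
    }

for threads, (speedup, efficiency) in scaling(samples_per_second).items():
    print(f"{threads:>2} threads: speedup {speedup:.2f}x, "
          f"efficiency {efficiency:.0%}")
```

Scaling stays close to linear up to 8 threads and flattens at 16. That pattern is consistent with the processor having 8 physical cores and 16 hardware threads, though the results themselves do not establish the cause.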
Google SynthMark SynthMark is a cross platform tool for benchmarking CPU performance under a variety of real-time audio workloads. It uses a polyphonic synthesizer model to provide standardized tests for latency, jitter and computational throughput. Learn more via the OpenBenchmarking.org test page.
Google SynthMark 20201109 (Voices, More Is Better)
Test: VoiceMark_100. AMD AOCC 3.2: 683.59 (SE +/- 3.97, N = 3); AOCC 3.2: 691.26 (SE +/- 0.34, N = 3)
(CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
WebP Image Encode This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.
WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
Encode Settings: Default. AMD AOCC 3.2: 1.150 (SE +/- 0.002, N = 3); AOCC 3.2: 1.154 (SE +/- 0.006, N = 3)
Encode Settings: Quality 100. AMD AOCC 3.2: 1.931 (SE +/- 0.002, N = 3); AOCC 3.2: 1.931 (SE +/- 0.004, N = 3)
Encode Settings: Quality 100, Lossless. AMD AOCC 3.2: 16.37 (SE +/- 0.09, N = 3); AOCC 3.2: 16.36 (SE +/- 0.08, N = 3)
Encode Settings: Quality 100, Highest Compression. AMD AOCC 3.2: 5.548 (SE +/- 0.000, N = 3); AOCC 3.2: 5.557 (SE +/- 0.013, N = 3)
Encode Settings: Quality 100, Lossless, Highest Compression. AMD AOCC 3.2: 33.28 (SE +/- 0.11, N = 3); AOCC 3.2: 33.04 (SE +/- 0.02, N = 3)
(CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16 -ltiff
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU AMD AOCC 3.2: 3.71990 (SE +/- 0.01119, N = 3; MIN: 3.53) AOCC 3.2: 3.72784 (SE +/- 0.01100, N = 3; MIN: 3.47) 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU AMD AOCC 3.2: 1.90323 (SE +/- 0.00196, N = 3; MIN: 1.64) AOCC 3.2: 1.90977 (SE +/- 0.00539, N = 3; MIN: 1.63) 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU AMD AOCC 3.2: 3.87713 (SE +/- 0.00042, N = 3; MIN: 3.76) AOCC 3.2: 3.88008 (SE +/- 0.00164, N = 3; MIN: 3.77) 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU AMD AOCC 3.2: 4.80136 (SE +/- 0.00259, N = 3; MIN: 4.46) AOCC 3.2: 4.80995 (SE +/- 0.01634, N = 3; MIN: 4.44) 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU AMD AOCC 3.2: 6.67736 (SE +/- 0.00595, N = 3; MIN: 6.52) AOCC 3.2: 6.68384 (SE +/- 0.00652, N = 3; MIN: 6.52) 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU AMD AOCC 3.2: 3415.20 (SE +/- 3.01, N = 3; MIN: 3405.21) AOCC 3.2: 3415.99 (SE +/- 3.04, N = 3; MIN: 3407.26) 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU AMD AOCC 3.2: 1753.94 (SE +/- 0.48, N = 3; MIN: 1749.78) AOCC 3.2: 1756.65 (SE +/- 0.88, N = 3; MIN: 1750.39) 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU AMD AOCC 3.2: 1.19844 (SE +/- 0.00222, N = 3; MIN: 1.06) AOCC 3.2: 1.20153 (SE +/- 0.00374, N = 3; MIN: 1.06) 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl
Google Draco Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better Google Draco 1.4.1 Model: Lion AMD AOCC 3.2: 5034 (SE +/- 9.61, N = 3) 1. (CXX) g++ options: -O3 -march=native
Model: Church Facade
AMD AOCC 3.2: The test quit with a non-zero exit status.
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU-v2-v2 - Model: mobilenet-v2 AMD AOCC 3.2: 4.26 (SE +/- 0.02, N = 3; MIN: 4.15 / MAX: 4.92) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU-v3-v3 - Model: mobilenet-v3 AMD AOCC 3.2: 3.79 (SE +/- 0.02, N = 3; MIN: 3.69 / MAX: 4.26) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: shufflenet-v2 AMD AOCC 3.2: 4.94 (SE +/- 0.04, N = 3; MIN: 4.79 / MAX: 6.89) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: mnasnet AMD AOCC 3.2: 3.86 (SE +/- 0.01, N = 3; MIN: 3.79 / MAX: 4.3) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: efficientnet-b0 AMD AOCC 3.2: 5.55 (SE +/- 0.01, N = 3; MIN: 5.46 / MAX: 6.26) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: blazeface AMD AOCC 3.2: 1.93 (SE +/- 0.00, N = 3; MIN: 1.89 / MAX: 2.56) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: googlenet AMD AOCC 3.2: 11.83 (SE +/- 0.16, N = 3; MIN: 11.47 / MAX: 20.56) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: vgg16 AMD AOCC 3.2: 29.20 (SE +/- 0.07, N = 3; MIN: 28.4 / MAX: 30.36) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: resnet18 AMD AOCC 3.2: 9.11 (SE +/- 0.03, N = 3; MIN: 8.35 / MAX: 19.7) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: alexnet AMD AOCC 3.2: 5.62 (SE +/- 0.01, N = 3; MIN: 5.28 / MAX: 16.19) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: resnet50 AMD AOCC 3.2: 16.71 (SE +/- 0.00, N = 3; MIN: 15.97 / MAX: 17.44) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: yolov4-tiny AMD AOCC 3.2: 21.46 (SE +/- 0.06, N = 3; MIN: 21.22 / MAX: 22.1) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: squeezenet_ssd AMD AOCC 3.2: 18.27 (SE +/- 0.02, N = 3; MIN: 17.97 / MAX: 18.95) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20210720 Target: CPU - Model: regnety_400m AMD AOCC 3.2: 10.12 (SE +/- 0.03, N = 3; MIN: 9.94 / MAX: 10.73) 1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
TNN TNN is an open-source deep learning inference framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better TNN 0.3 Target: CPU - Model: DenseNet AMD AOCC 3.2: 3763.13 (SE +/- 16.11, N = 3; MIN: 3578.21 / MAX: 4042.76) 1. (CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
OpenBenchmarking.org ms, Fewer Is Better TNN 0.3 Target: CPU - Model: MobileNet v2 AMD AOCC 3.2: 395.53 (SE +/- 1.83, N = 3; MIN: 390.5 / MAX: 402.38) 1. (CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
OpenBenchmarking.org ms, Fewer Is Better TNN 0.3 Target: CPU - Model: SqueezeNet v2 AMD AOCC 3.2: 55.13 (SE +/- 0.03, N = 3; MIN: 54.6 / MAX: 55.49) 1. (CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
OpenBenchmarking.org ms, Fewer Is Better TNN 0.3 Target: CPU - Model: SqueezeNet v1.1 AMD AOCC 3.2: 240.49 (SE +/- 0.21, N = 3; MIN: 240.08 / MAX: 241.08) 1. (CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
Test: Object Detection
AMD AOCC 3.2: The test quit with a non-zero exit status.
Test: DNN - Deep Neural Network
AMD AOCC 3.2: ./opencv_perf_dnn: symbol lookup error: ./opencv_perf_dnn: undefined symbol: _ZN2cv3dnn14dnn4_v2021100419getAvailableTargetsENS1_7BackendE
POV-Ray This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better POV-Ray 3.7.0.7 Trace Time AMD AOCC 3.2: 47.82 (SE +/- 0.01, N = 3) AOCC 3.2: 47.80 (SE +/- 0.06, N = 3) 1. (CXX) g++ options: -pipe -O3 -ffast-math -march=native -pthread -lXpm -lSM -lICE -lX11 -lIlmImf -lIlmImf-2_5 -lImath-2_5 -lHalf-2_5 -lIex-2_5 -lIexMath-2_5 -lIlmThread-2_5 -lIlmThread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
Primesieve Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 7.7 1e12 Prime Number Generation AMD AOCC 3.2: 20.95 (SE +/- 0.03, N = 3) AOCC 3.2: 20.95 (SE +/- 0.05, N = 3) 1. (CXX) g++ options: -O3 -march=native -lpthread
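For readers unfamiliar with the algorithm being timed, here is a minimal sketch of a plain sieve of Eratosthenes. It is not Primesieve's implementation, which is far more heavily optimized, and the function name `primes_up_to` is hypothetical. It only illustrates the core idea of crossing off multiples of each prime, and it is practical only for small limits, nowhere near the 1e12 range used in the benchmark.

```python
def primes_up_to(n):
    """Return all primes <= n using a basic sieve of Eratosthenes."""
    if n < 2:
        return []
    # One flag per integer in 0..n; 1 means "still considered prime".
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross off p*p, p*p + p, ... in one slice assignment.
            count = len(range(p * p, n + 1, p))
            is_prime[p * p::p] = bytes(count)
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


print(primes_up_to(30))
```

Because the flag array is touched at a regular stride for every prime, the working set quickly outgrows the CPU caches as the limit rises, which is why sieve benchmarks are sensitive to cache performance.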
Ngspice Ngspice is an open-source SPICE circuit simulator. Ngspice was originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile is making use of the ISCAS 85 benchmark circuits. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better Ngspice 34 Circuit: C2670 AMD AOCC 3.2: 88.42 (SE +/- 0.10, N = 3) AOCC 3.2: 88.66 (SE +/- 0.13, N = 3) 1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE
OpenBenchmarking.org Seconds, Fewer Is Better Ngspice 34 Circuit: C7552 AMD AOCC 3.2: 77.85 (SE +/- 0.00, N = 3) AOCC 3.2: 77.03 (SE +/- 0.78, N = 5) 1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE
RNNoise RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26 minute long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better RNNoise 2020-06-28 AMD AOCC 3.2: 16.84 (SE +/- 0.04, N = 3) AOCC 3.2: 16.86 (SE +/- 0.01, N = 3) 1. (CC) gcc options: -O3 -march=native -pedantic -fvisibility=hidden
Basis Universal Basis Universal is a GPU texture codec. This test times how long it takes to convert sRGB PNGs into Basis Universal assets with various settings. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: ETC1S AMD AOCC 3.2: 27.96 (SE +/- 0.15, N = 3) AOCC 3.2: 27.47 (SE +/- 0.04, N = 3) 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: UASTC Level 0 AMD AOCC 3.2: 6.768 (SE +/- 0.006, N = 3) AOCC 3.2: 6.742 (SE +/- 0.002, N = 3) 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: UASTC Level 2 AMD AOCC 3.2: 27.53 (SE +/- 0.00, N = 3) AOCC 3.2: 27.51 (SE +/- 0.01, N = 3) 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: UASTC Level 3 AMD AOCC 3.2: 51.15 (SE +/- 0.00, N = 3) AOCC 3.2: 51.14 (SE +/- 0.01, N = 3) 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
KTX-Software toktx This is a benchmark of The Khronos Group's KTX-Software library and tools. KTX-Software provides "toktx" for creating image textures in the KTX container format or converting images to it. This benchmark times how long it takes to convert a reference PNG sample input to the KTX 2.0 format with various settings. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better KTX-Software toktx 4.0 Settings: UASTC 3 AMD AOCC 3.2: 11.21 (SE +/- 0.00, N = 3) AOCC 3.2: 11.18 (SE +/- 0.00, N = 3)
AOCC 3.2: Testing initiated at 17 December 2021 04:13 by user phoronix.
AMD AOCC 3.2: Testing initiated at 17 December 2021 14:11 by user phoronix.