aocc-testing

AMD EPYC 72F3 8-Core testing with a Supermicro H12SSL-i v1.01 (2.0 BIOS) and ASPEED on Ubuntu 21.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2112184-TJ-AOCCTESTI36&grr&sro.

aocc-testing system details
Configurations: AOCC 3.2, AMD AOCC 3.2

Processor: AMD EPYC 72F3 8-Core @ 3.70GHz (8 Cores / 16 Threads)
Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS)
Chipset: AMD Starship/Matisse
Memory: 126GB
Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 1000GB Corsair Force MP600
Graphics: ASPEED
Monitor: VE228
Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 21.04
Kernel: 5.14.0-rc7-amd-pstate-phx (x86_64) 20210909
Desktop: GNOME Shell 3.38.4
Display Server: X Server
Compiler: Clang 13.0.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details: Transparent Huge Pages: madvise
Environment Details: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Details: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver3
Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa001119
Python Details: Python 3.9.5
Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

aocc-testing result overview table (all tests, columns AOCC 3.2 and AMD AOCC 3.2). Individual results follow.

CppPerformanceBenchmarks

Test: Random Numbers

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 1447.91 (SE +/- 0.10, N = 3)
AOCC 3.2: 1448.42 (SE +/- 0.21, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0 (GFLOP/s, More Is Better)
AMD AOCC 3.2: 0.910175 (SE +/- 0.005076, N = 3)
AOCC 3.2: 0.741814 (SE +/- 0.097317, N = 9)
1. (CC) gcc options: -O3 -march=native -fopenmp
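
The two DGEMM figures differ more than most results in this export, and the AOCC 3.2 run also carries a much larger spread. As a rough reading aid, the sketch below computes the relative difference and compares the gap with the combined spread. It assumes that "SE" denotes the standard error of the mean and that the two runs are independent; the function names are hypothetical helpers, not part of the Phoronix Test Suite.

```python
import math


def relative_difference_percent(value: float, baseline: float) -> float:
    """Percent by which `value` differs from `baseline`."""
    return (value - baseline) / baseline * 100.0


def gap_in_combined_se(a: float, se_a: float, b: float, se_b: float) -> float:
    """Absolute gap between two means, in units of their combined standard error.

    Assumes the two standard errors come from independent runs.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(a - b) / combined_se


# Figures copied from the ACES DGEMM result (GFLOP/s, more is better).
amd_aocc, amd_se = 0.910175, 0.005076
aocc, aocc_se = 0.741814, 0.097317

print(f"relative difference: {relative_difference_percent(amd_aocc, aocc):.1f}%")
print(f"gap / combined SE:   {gap_in_combined_se(amd_aocc, amd_se, aocc, aocc_se):.2f}")
```

A gap of only a couple of combined standard errors suggests the difference should be read with caution, since the AOCC 3.2 run needed N = 9 samples and still shows a large spread.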

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU

ONNX Runtime 1.10 (Inferences Per Minute, More Is Better)
AMD AOCC 3.2: 54 (SE +/- 2.57, N = 12)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=thin -ldl -pthread -lrt -lpthread

ONNX Runtime

Model: shufflenet-v2-10 - Device: CPU

ONNX Runtime 1.10 (Inferences Per Minute, More Is Better)
AMD AOCC 3.2: 21250 (SE +/- 460.99, N = 12)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=thin -ldl -pthread -lrt -lpthread

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.28 (Nodes Per Second, More Is Better)
AMD AOCC 3.2: 2085 (SE +/- 14.26, N = 3)
AOCC 3.2: 2122 (SE +/- 23.00, N = 3)
1. (CXX) g++ options: -flto -O3 -march=native -pthread

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.28 (Nodes Per Second, More Is Better)
AMD AOCC 3.2: 1980 (SE +/- 24.17, N = 3)
AOCC 3.2: 1946 (SE +/- 14.15, N = 3)
1. (CXX) g++ options: -flto -O3 -march=native -pthread

JPEG XL libjxl

Input: PNG - Encode Speed: 8

JPEG XL libjxl 0.6.1 (MP/s, More Is Better)
AMD AOCC 3.2: 0.92 (SE +/- 0.00, N = 3)
AOCC 3.2: 0.91 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie

KTX-Software toktx

Settings: UASTC 4 + Zstd Compression 19

KTX-Software toktx 4.0 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 310.53 (SE +/- 0.06, N = 3)

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 (MFLOPS, More Is Better)
AMD AOCC 3.2: 4188.52 (SE +/- 47.93, N = 15)
AOCC 3.2: 4142.06 (SE +/- 47.80, N = 15)
1. (CC) gcc options: -O3 -march=native -mavx2

SecureMark

Benchmark: SecureMark-TLS

SecureMark 1.0.4 (marks, More Is Better)
AMD AOCC 3.2: 302196 (SE +/- 775.46, N = 3)
AOCC 3.2: 300122 (SE +/- 1160.02, N = 3)
1. (CC) gcc options: -pedantic -O3

TNN

Target: CPU - Model: DenseNet

TNN 0.3 (ms, Fewer Is Better)
AMD AOCC 3.2: 3763.13 (SE +/- 16.11, N = 3; MIN: 3578.21 / MAX: 4042.76)
1. (CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

CppPerformanceBenchmarks

Test: Math Library

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 255.06 (SE +/- 0.20, N = 3)
AOCC 3.2: 255.19 (SE +/- 0.58, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

Apache HTTP Server

Concurrent Requests: 1

Apache HTTP Server 2.4.48 (Requests Per Second, More Is Better)
AMD AOCC 3.2: 6329.40 (SE +/- 59.67, N = 7)
1. (CC) gcc options: -shared -fPIC -pthread -O3 -march=native

ONNX Runtime

Model: yolov4 - Device: CPU

ONNX Runtime 1.10 (Inferences Per Minute, More Is Better)
AMD AOCC 3.2: 296 (SE +/- 0.58, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=thin -ldl -pthread -lrt -lpthread

ONNX Runtime

Model: super-resolution-10 - Device: CPU

ONNX Runtime 1.10 (Inferences Per Minute, More Is Better)
AMD AOCC 3.2: 3100 (SE +/- 2.46, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=thin -ldl -pthread -lrt -lpthread

JPEG XL libjxl

Input: PNG - Encode Speed: 7

JPEG XL libjxl 0.6.1 (MP/s, More Is Better)
AMD AOCC 3.2: 9.07 (SE +/- 0.01, N = 3)
AOCC 3.2: 9.15 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie

Chia Blockchain VDF

Test: Square Assembly Optimized

Chia Blockchain VDF 1.0.1 (IPS, More Is Better)
AMD AOCC 3.2: 127313 (SE +/- 1751.38, N = 15)
AOCC 3.2: 128133 (SE +/- 635.96, N = 3)
1. (CXX) g++ options: -flto -fno-PIE -lgmpxx -lgmp -lboost_system -pthread

Ngspice

Circuit: C7552

Ngspice 34 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 77.85 (SE +/- 0.00, N = 3)
AOCC 3.2: 77.03 (SE +/- 0.78, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

SVT-AV1

Encoder Mode: Preset 4 - Input: Bosphorus 4K

SVT-AV1 0.8.7 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 1.610 (SE +/- 0.003, N = 3)
AOCC 3.2: 1.608 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

Apache HTTP Server

Concurrent Requests: 500

Apache HTTP Server 2.4.48 (Requests Per Second, More Is Better)
AMD AOCC 3.2: 84374.51 (SE +/- 185.04, N = 3)
1. (CC) gcc options: -shared -fPIC -pthread -O3 -march=native

nginx

Concurrent Requests: 20

nginx 1.21.1 (Requests Per Second, More Is Better)
AMD AOCC 3.2: 202421.97 (SE +/- 181.54, N = 3)
1. (CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native

nginx

Concurrent Requests: 200

nginx 1.21.1 (Requests Per Second, More Is Better)
AMD AOCC 3.2: 201097.54 (SE +/- 579.54, N = 3)
1. (CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native

Apache HTTP Server

Concurrent Requests: 200

Apache HTTP Server 2.4.48 (Requests Per Second, More Is Better)
AMD AOCC 3.2: 89516.52 (SE +/- 276.94, N = 3)
1. (CC) gcc options: -shared -fPIC -pthread -O3 -march=native

nginx

Concurrent Requests: 1

nginx 1.21.1 (Requests Per Second, More Is Better)
AMD AOCC 3.2: 52167.21 (SE +/- 275.54, N = 3)
1. (CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native

Ngspice

Circuit: C2670

Ngspice 34 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 88.42 (SE +/- 0.10, N = 3)
AOCC 3.2: 88.66 (SE +/- 0.13, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 3.3.2 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 87.30 (SE +/- 0.09, N = 3)
AOCC 3.2: 87.18 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lhmmer -leasel -lm -lmpi

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AMD AOCC 3.2: 3415.20 (SE +/- 3.01, N = 3; MIN: 3405.21)
AOCC 3.2: 3415.99 (SE +/- 3.04, N = 3; MIN: 3407.26)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AMD AOCC 3.2: 1753.94 (SE +/- 0.48, N = 3; MIN: 1749.78)
AOCC 3.2: 1756.65 (SE +/- 0.88, N = 3; MIN: 1750.39)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

LZ4 Compression

Compression Level: 3 - Decompression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
AMD AOCC 3.2: 14235.6 (SE +/- 10.69, N = 5)
AOCC 3.2: 14443.4 (SE +/- 5.07, N = 3)
1. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 3 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
AMD AOCC 3.2: 60.67 (SE +/- 0.67, N = 5)
AOCC 3.2: 59.19 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -O3

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Medium

Kvazaar 2.1 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 8.37 (SE +/- 0.01, N = 3)
AOCC 3.2: 8.36 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

SVT-HEVC

Tuning: 1 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 9.17 (SE +/- 0.01, N = 3)
AOCC 3.2: 9.21 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AMD AOCC 3.2: 123
AOCC 3.2: 123
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AMD AOCC 3.2: 240
AOCC 3.2: 238
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AMD AOCC 3.2: 576
AOCC 3.2: 576
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AMD AOCC 3.2: 192
AOCC 3.2: 192
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AMD AOCC 3.2: 637
AOCC 3.2: 620
SE +/- 0.33, N = 3 (shown for one configuration only)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AMD AOCC 3.2: 963 (SE +/- 6.17, N = 3)
AOCC 3.2: 958 (SE +/- 1.15, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AMD AOCC 3.2: 799
AOCC 3.2: 785
SE +/- 0.33, N = 3 (shown for one configuration only)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

NCNN

Target: CPU - Model: regnety_400m

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 10.12 (SE +/- 0.03, N = 3; MIN: 9.94 / MAX: 10.73)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 18.27 (SE +/- 0.02, N = 3; MIN: 17.97 / MAX: 18.95)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 21.46 (SE +/- 0.06, N = 3; MIN: 21.22 / MAX: 22.1)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: resnet50

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 16.71 (SE +/- 0.00, N = 3; MIN: 15.97 / MAX: 17.44)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: alexnet

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 5.62 (SE +/- 0.01, N = 3; MIN: 5.28 / MAX: 16.19)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: resnet18

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 9.11 (SE +/- 0.03, N = 3; MIN: 8.35 / MAX: 19.7)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: vgg16

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 29.20 (SE +/- 0.07, N = 3; MIN: 28.4 / MAX: 30.36)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: googlenet

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 11.83 (SE +/- 0.16, N = 3; MIN: 11.47 / MAX: 20.56)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: blazeface

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 1.93 (SE +/- 0.00, N = 3; MIN: 1.89 / MAX: 2.56)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 5.55 (SE +/- 0.01, N = 3; MIN: 5.46 / MAX: 6.26)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 3.86 (SE +/- 0.01, N = 3; MIN: 3.79 / MAX: 4.3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 4.94 (SE +/- 0.04, N = 3; MIN: 4.79 / MAX: 6.89)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 3.79 (SE +/- 0.02, N = 3; MIN: 3.69 / MAX: 4.26)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 4.26 (SE +/- 0.02, N = 3; MIN: 4.15 / MAX: 4.92)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: mobilenet

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 12.54 (SE +/- 0.01, N = 3; MIN: 12.29 / MAX: 13.52)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

LZ4 Compression

Compression Level: 9 - Decompression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
AMD AOCC 3.2: 14302.3 (SE +/- 25.03, N = 3)
AOCC 3.2: 14477.7 (SE +/- 5.45, N = 3)
1. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 9 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
AMD AOCC 3.2: 55.77 (SE +/- 0.52, N = 3)
AOCC 3.2: 55.87 (SE +/- 0.54, N = 3)
1. (CC) gcc options: -O3

CppPerformanceBenchmarks

Test: Stepanov Vector

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 55.13 (SE +/- 0.03, N = 3)
AOCC 3.2: 55.10 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

x265

Video Input: Bosphorus 4K

x265 3.4 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 11.82 (SE +/- 0.08, N = 3)
AOCC 3.2: 11.63 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl -lnuma

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.13 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 51.15 (SE +/- 0.00, N = 3)
AOCC 3.2: 51.14 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

POV-Ray

Trace Time

POV-Ray 3.7.0.7 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 47.82 (SE +/- 0.01, N = 3)
AOCC 3.2: 47.80 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -pipe -O3 -ffast-math -march=native -pthread -lXpm -lSM -lICE -lX11 -lIlmImf -lIlmImf-2_5 -lImath-2_5 -lHalf-2_5 -lIex-2_5 -lIexMath-2_5 -lIlmThread-2_5 -lIlmThread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 3731.5 (SE +/- 15.81, N = 3)
AOCC 3.2: 3724.2 (SE +/- 33.83, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 42.2 (SE +/- 0.44, N = 3)
AOCC 3.2: 41.8 (SE +/- 0.09, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

CppPerformanceBenchmarks

Test: Atol

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 43.76 (SE +/- 0.05, N = 3)
AOCC 3.2: 43.51 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

JPEG XL Decoding libjxl

CPU Threads: 1

JPEG XL Decoding libjxl 0.6.1 (MP/s, More Is Better)
AMD AOCC 3.2: 66.73 (SE +/- 0.13, N = 3)
AOCC 3.2: 68.21 (SE +/- 0.07, N = 3)

CppPerformanceBenchmarks

Test: Ctype

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 41.57 (SE +/- 0.32, N = 3)
AOCC 3.2: 41.06 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 3717.6 (SE +/- 38.85, N = 3)
AOCC 3.2: 3648.8 (SE +/- 36.61, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 51.7 (SE +/- 0.17, N = 3)
AOCC 3.2: 51.6 (SE +/- 0.10, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

SVT-AV1 0.8.7 (Frames Per Second, More Is Better)
AOCC 3.2: 15.72 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

Zstd Compression

Compression Level: 3, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 4392.4 (SE +/- 35.14, N = 3)
AOCC 3.2: 4432.3 (SE +/- 30.47, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 443.2 (SE +/- 1.57, N = 3)
AOCC 3.2: 438.2 (SE +/- 1.12, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 8 - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 4138.3 (SE +/- 9.09, N = 3)
AOCC 3.2: 4181.2 (SE +/- 13.63, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 8 - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 1396.2 (SE +/- 5.45, N = 3)
AOCC 3.2: 1389.5 (SE +/- 9.90, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 8, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 4495.3 (SE +/- 56.68, N = 3)
AOCC 3.2: 4551.4 (SE +/- 33.52, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 8, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 814.0 (SE +/- 1.12, N = 3)
AOCC 3.2: 811.4 (SE +/- 7.35, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 3 - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 4009.5 (SE +/- 4.62, N = 3)
AOCC 3.2: 4081.1 (SE +/- 41.70, N = 2)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 3 - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 3251.9 (SE +/- 4.13, N = 3)
AOCC 3.2: 3193.7 (SE +/- 5.75, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Botan

Test: AES-256 - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 5359.63 (SE +/- 11.79, N = 3)
AOCC 3.2: 5369.01 (SE +/- 24.03, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: AES-256

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 5356.22 (SE +/- 10.99, N = 3)
AOCC 3.2: 5358.65 (SE +/- 11.83, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Stockfish

Total Time

Stockfish 13 (Nodes Per Second, More Is Better)
AMD AOCC 3.2: 25833642 (SE +/- 117326.99, N = 3)
AOCC 3.2: 25886541 (SE +/- 115127.05, N = 3)
1. (CXX) g++ options: -fprofile-use -m64 -lpthread -O3 -march=native -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -flto

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
AMD AOCC 3.2: 33.28 (SE +/- 0.11, N = 3)
AOCC 3.2: 33.04 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16 -ltiff

LZ4 Compression

Compression Level: 1 - Decompression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
AMD AOCC 3.2: 15163.7 (SE +/- 60.63, N = 3)
AOCC 3.2: 15416.8 (SE +/- 59.35, N = 3)
1. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 1 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
AMD AOCC 3.2: 13906.92 (SE +/- 7.51, N = 3)
AOCC 3.2: 13729.37 (SE +/- 30.17, N = 3)
1. (CC) gcc options: -O3

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Kvazaar 2.1 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 19.17 (SE +/- 0.04, N = 3)
AOCC 3.2: 19.17 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

Botan

Test: ChaCha20Poly1305 - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 896.22 (SE +/- 6.86, N = 3)
AOCC 3.2: 885.90 (SE +/- 8.98, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: ChaCha20Poly1305

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 906.28 (SE +/- 12.67, N = 3)
AOCC 3.2: 902.82 (SE +/- 12.18, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Blowfish - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 430.15 (SE +/- 0.05, N = 3)
AOCC 3.2: 429.99 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Blowfish

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 415.01 (SE +/- 0.01, N = 3)
AOCC 3.2: 414.60 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 397.06 (SE +/- 0.14, N = 3)
AOCC 3.2: 396.72 (SE +/- 0.29, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 368.11 (SE +/- 0.13, N = 3)
AOCC 3.2: 366.83 (SE +/- 0.27, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256 - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 154.61 (SE +/- 0.02, N = 3)
AOCC 3.2: 154.60 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 157.58 (SE +/- 0.14, N = 3)
AOCC 3.2: 157.75 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: KASUMI - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 97.14 (SE +/- 0.04, N = 3)
AOCC 3.2: 97.17 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: KASUMI

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 98.00 (SE +/- 0.02, N = 3)
AOCC 3.2: 98.05 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Google SynthMark

Test: VoiceMark_100

Google SynthMark 20201109 (Voices, More Is Better)
AMD AOCC 3.2: 683.59 (SE +/- 3.97, N = 3)
AOCC 3.2: 691.26 (SE +/- 0.34, N = 3)
1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast

Basis Universal

Settings: ETC1S

Basis Universal 1.13 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 27.96 (SE +/- 0.15, N = 3)
AOCC 3.2: 27.47 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

TNN

Target: CPU - Model: MobileNet v2

TNN 0.3 (ms, Fewer Is Better)
AMD AOCC 3.2: 395.53 (SE +/- 1.83, N = 3; MIN: 390.5 / MAX: 402.38)
1. (CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

Basis Universal

Settings: UASTC Level 2

Basis Universal 1.13 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 27.53 (SE +/- 0.00, N = 3)
AOCC 3.2: 27.51 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Chia Blockchain VDF

Test: Square Plain C++

Chia Blockchain VDF 1.0.1 (IPS, More Is Better)
AMD AOCC 3.2: 186667 (SE +/- 961.48, N = 3)
AOCC 3.2: 187767 (SE +/- 560.75, N = 3)
1. (CXX) g++ options: -flto -fno-PIE -lgmpxx -lgmp -lboost_system -pthread

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.3 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 15.87 (SE +/- 0.02, N = 5)
AOCC 3.2: 15.86 (SE +/- 0.02, N = 5)
1. (CXX) g++ options: -O3 -march=native -logg -lm

QuantLib

QuantLib 1.21 (MFLOPS, More Is Better)
AMD AOCC 3.2: 3208.6 (SE +/- 4.72, N = 3)
AOCC 3.2: 3212.8 (SE +/- 3.76, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 2.1.0 (Megapixels/sec, More Is Better)
AMD AOCC 3.2: 244.12 (SE +/- 0.87, N = 3)
AOCC 3.2: 244.70 (SE +/- 0.21, N = 3)
1. (CC) gcc options: -O3 -march=native -rdynamic

dav1d

Video Input: Chimera 1080p 10-bit

dav1d 0.9.2 (FPS, More Is Better)
AMD AOCC 3.2: 454.48 (SE +/- 1.84, N = 3; MIN: 362.96 / MAX: 709.88)
AOCC 3.2: 452.42 (SE +/- 1.57, N = 3; MIN: 362.02 / MAX: 691.34)
1. (CC) gcc options: -O3 -march=native -pthread -lm

CppPerformanceBenchmarks

Test: Stepanov Abstraction

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 23.60 (SE +/- 0.00, N = 3)
AOCC 3.2: 23.60 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

JPEG XL Decoding libjxl

CPU Threads: All

JPEG XL Decoding libjxl 0.6.1 (MP/s, More Is Better)
AMD AOCC 3.2: 298.89 (SE +/- 0.25, N = 3)
AOCC 3.2: 307.23 (SE +/- 0.40, N = 3)

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0 (Iterations/Sec, More Is Better)
AMD AOCC 3.2: 358684.33 (SE +/- 212.94, N = 3)
AOCC 3.2: 358537.89 (SE +/- 457.95, N = 3)
1. (CC) gcc options: -O2 -O3 -march=native -lrt" -lrt

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AMD AOCC 3.2: 4.80136 (SE +/- 0.00259, N = 3; MIN: 4.46)
AOCC 3.2: 4.80995 (SE +/- 0.01634, N = 3; MIN: 4.44)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

Etcpak

Configuration: ETC2

Etcpak 0.7 (Mpx/s, More Is Better)
AMD AOCC 3.2: 231.49 (SE +/- 0.02, N = 3)
AOCC 3.2: 230.84 (SE +/- 0.64, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

Primesieve

1e12 Prime Number Generation

Primesieve 7.7 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 20.95 (SE +/- 0.03, N = 3)
AOCC 3.2: 20.95 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=native -lpthread

dav1d

Video Input: Chimera 1080p

dav1d 0.9.2 (FPS, More Is Better)
AMD AOCC 3.2: 538.84 (SE +/- 0.97, N = 3; MIN: 431.65 / MAX: 824.69)
1. (CC) gcc options: -O3 -march=native -pthread -lm

Liquid-DSP

Threads: 4 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
AMD AOCC 3.2: 264220000 (SE +/- 1730086.70, N = 3)
AOCC 3.2: 263413333 (SE +/- 1809570.24, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 2 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
AMD AOCC 3.2: 132660000 (SE +/- 155241.75, N = 3)
AOCC 3.2: 132690000 (SE +/- 132035.35, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
AMD AOCC 3.2: 66306667 (SE +/- 66838.94, N = 3)
AOCC 3.2: 66342333 (SE +/- 37860.86, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
AMD AOCC 3.2: 695340000 (SE +/- 162583.31, N = 3)
AOCC 3.2: 695740000 (SE +/- 75498.34, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 8 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
AMD AOCC 3.2: 518416667 (SE +/- 652337.68, N = 3)
AOCC 3.2: 520683333 (SE +/- 1036505.88, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

Etcpak

Configuration: ETC1

Etcpak 0.7 (Mpx/s, More Is Better)
AMD AOCC 3.2: 245.96 (SE +/- 0.04, N = 3)
AOCC 3.2: 245.74 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

dav1d

Video Input: Summer Nature 4K

dav1d 0.9.2 (FPS, More Is Better)
AMD AOCC 3.2: 200.33 (SE +/- 0.14, N = 3; MIN: 188.52 / MAX: 227.27)
AOCC 3.2: 199.98 (SE +/- 0.23, N = 3; MIN: 188.58 / MAX: 226.52)
1. (CC) gcc options: -O3 -march=native -pthread -lm

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

Kvazaar 2.1 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 32.26 (SE +/- 0.10, N = 3)
AOCC 3.2: 32.17 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

KTX-Software toktx

Settings: Zstd Compression 19

KTX-Software toktx 4.0 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 17.56 (SE +/- 0.04, N = 3)
AOCC 3.2: 17.42 (SE +/- 0.06, N = 3)

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.3 (ms, Fewer Is Better)
AMD AOCC 3.2: 240.49 (SE +/- 0.21, N = 3; MIN: 240.08 / MAX: 241.08)
1. (CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

RNNoise

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 16.84 (SE +/- 0.04, N = 3)
AOCC 3.2: 16.86 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -O3 -march=native -pedantic -fvisibility=hidden

KTX-Software toktx

Settings: UASTC 3 + Zstd Compression 19

KTX-Software toktx 4.0 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 16.57 (SE +/- 0.02, N = 3)
AOCC 3.2: 16.54 (SE +/- 0.01, N = 3)

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
AMD AOCC 3.2: 16.37 (SE +/- 0.09, N = 3)
AOCC 3.2: 16.36 (SE +/- 0.08, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16 -ltiff

Etcpak

Configuration: ETC1 + Dithering

Etcpak 0.7 (Mpx/s, More Is Better)
AMD AOCC 3.2: 296.99 (SE +/- 0.04, N = 3)
AOCC 3.2: 297.12 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AMD AOCC 3.2: 3.71990 (SE +/- 0.01119, N = 3; MIN: 3.53)
AOCC 3.2: 3.72784 (SE +/- 0.01100, N = 3; MIN: 3.47)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

CppPerformanceBenchmarks

Test: Function Objects

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 14.68 (SE +/- 0.01, N = 3)
AOCC 3.2: 14.40 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

JPEG XL libjxl

Input: JPEG - Encode Speed: 7

JPEG XL libjxl 0.6.1 (MP/s, More Is Better)
AMD AOCC 3.2: 76.07 (SE +/- 0.08, N = 3)
AOCC 3.2: 77.94 (SE +/- 0.41, N = 3)
1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AMD AOCC 3.2: 1.19844 (SE +/- 0.00222, N = 3; MIN: 1.06)
AOCC 3.2: 1.20153 (SE +/- 0.00374, N = 3; MIN: 1.06)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

KTX-Software toktx

Settings: UASTC 3

KTX-Software toktx 4.0 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 11.21 (SE +/- 0.00, N = 3)
AOCC 3.2: 11.18 (SE +/- 0.00, N = 3)

x265

Video Input: Bosphorus 1080p

x265 3.4 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 57.20 (SE +/- 0.18, N = 3)
AOCC 3.2: 57.47 (SE +/- 0.49, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl -lnuma

JPEG XL libjxl

Input: JPEG - Encode Speed: 8

JPEG XL libjxl 0.6.1 (MP/s, More Is Better)
AMD AOCC 3.2: 28.69 (SE +/- 0.07, N = 3)
AOCC 3.2: 28.93 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AMD AOCC 3.2: 1.90323 (SE +/- 0.00196, N = 3; MIN: 1.64)
AOCC 3.2: 1.90977 (SE +/- 0.00539, N = 3; MIN: 1.63)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

dav1d

Video Input: Summer Nature 1080p

dav1d 0.9.2 (FPS, More Is Better)
AMD AOCC 3.2: 504.53 (SE +/- 0.93, N = 3; MIN: 458.56 / MAX: 543.03)
1. (CC) gcc options: -O3 -march=native -pthread -lm

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 6.968 (SE +/- 0.002, N = 3)
AOCC 3.2: 6.971 (SE +/- 0.003, N = 3)
1. (CC) gcc options: -O3 -pipe -march=native -lncurses -lm

Basis Universal

Settings: UASTC Level 0

Basis Universal 1.13 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 6.768 (SE +/- 0.006, N = 3)
AOCC 3.2: 6.742 (SE +/- 0.002, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Google Draco

Model: Lion

Google Draco 1.4.1 (ms, Fewer Is Better)
AMD AOCC 3.2: 5034 (SE +/- 9.61, N = 3)
1. (CXX) g++ options: -O3 -march=native

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AMD AOCC 3.2: 3.87713 (SE +/- 0.00042, N = 3; MIN: 3.76)
AOCC 3.2: 3.88008 (SE +/- 0.00164, N = 3; MIN: 3.77)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
AMD AOCC 3.2: 5.548 (SE +/- 0.000, N = 3)
AOCC 3.2: 5.557 (SE +/- 0.013, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16 -ltiff

SVT-HEVC

Tuning: 7 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 117.07 (SE +/- 0.20, N = 3)
AOCC 3.2: 117.45 (SE +/- 0.12, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3 (Frames Per Second, More Is Better)
AOCC 3.2: 123.71 (SE +/- 0.60, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 151.30 (SE +/- 0.23, N = 3)
AOCC 3.2: 151.74 (SE +/- 0.51, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 157.99 (SE +/- 0.15, N = 3)
AOCC 3.2: 158.43 (SE +/- 1.08, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

TNN

Target: CPU - Model: SqueezeNet v2

TNN 0.3 (ms, Fewer Is Better)
AMD AOCC 3.2: 55.13 (SE +/- 0.03, N = 3; MIN: 54.6 / MAX: 55.49)
1. (CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AMD AOCC 3.2: 6.67736 (SE +/- 0.00595, N = 3; MIN: 6.52)
AOCC 3.2: 6.68384 (SE +/- 0.00652, N = 3; MIN: 6.52)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

SVT-HEVC

Tuning: 10 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 231.07 (SE +/- 0.38, N = 3)
AOCC 3.2: 229.83 (SE +/- 0.63, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

KTX-Software toktx

Settings: Zstd Compression 9

KTX-Software toktx 4.0 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 2.447 (SE +/- 0.002, N = 3)
AOCC 3.2: 2.444 (SE +/- 0.001, N = 3)

WebP Image Encode

Encode Settings: Quality 100

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
AMD AOCC 3.2: 1.931 (SE +/- 0.002, N = 3)
AOCC 3.2: 1.931 (SE +/- 0.004, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16 -ltiff

Etcpak

Configuration: DXT1

Etcpak 0.7 (Mpx/s, More Is Better)
AMD AOCC 3.2: 3032.89 (SE +/- 6.89, N = 3)
AOCC 3.2: 3042.99 (SE +/- 2.62, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

WebP Image Encode

Encode Settings: Default

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
AMD AOCC 3.2: 1.150 (SE +/- 0.002, N = 3)
AOCC 3.2: 1.154 (SE +/- 0.006, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16 -ltiff


Phoronix Test Suite v10.8.5