EPYC 7502 AOCC 2.3 Compiler Comparison

AMD EPYC 7502 testing of various benchmarks under AMD AOCC 2.3, GCC 10.2, and LLVM Clang 11, with CFLAGS/CXXFLAGS of "-O3 -march=znver2" used throughout. Benchmarks by Michael Larabel for a future article.
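As a rough illustration of how the same flags can be handed to each compiler, the sketch below builds a per-compiler environment. The flag string comes from the description above; the driver names in `COMPILERS` and the helper `build_env` are hypothetical, and how the AOCC binaries are placed on `PATH` is an assumption about the local installation, not something this export records.

```python
import os

# Flags stated in the result description; identical for every compiler.
COMMON_FLAGS = "-O3 -march=znver2"

# Hypothetical driver names; adjust to the local installation.
COMPILERS = {
    "GCC 10.2": ("gcc", "g++"),
    "LLVM Clang 11": ("clang", "clang++"),
    # Assumption: the AOCC-provided clang/clang++ are first on PATH for this run.
    "AMD AOCC 2.3": ("clang", "clang++"),
}


def build_env(label, base_env=None):
    """Return a copy of the environment with CC/CXX and the common flags set."""
    env = dict(os.environ if base_env is None else base_env)
    cc, cxx = COMPILERS[label]
    env.update({
        "CC": cc,
        "CXX": cxx,
        "CFLAGS": COMMON_FLAGS,
        "CXXFLAGS": COMMON_FLAGS,
    })
    return env


env = build_env("GCC 10.2", {})
```

The returned dictionary could be passed as `env=` to `subprocess.run` when invoking a build; that step is left out because the actual build commands differ per benchmark.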

HTML result view exported from: https://openbenchmarking.org/result/2012080-HA-EPYC7502A97&sgm=1&swl=1&sor&gru.

System configuration (shared by the GCC 10.2, LLVM Clang 11, and AMD AOCC 2.3 runs unless noted):

Processor: AMD EPYC 7502 32-Core @ 2.50GHz (32 Cores / 64 Threads)
Motherboard: ASRockRack EPYCD8 (P2.10 BIOS)
Chipset: AMD Starship/Matisse
Memory: 126GB
Disk: 280GB INTEL SSDPED1D280GA
Graphics: ASPEED
Audio: AMD Starship/Matisse
Monitor: VE228
Network: 2 x Intel I350
OS: Ubuntu 20.10
Kernel: 5.8.0-31-generic (x86_64)
Desktop: GNOME Shell 3.38.1
Display Server: X Server 1.20.9
Display Driver: modesetting 1.20.9
Compiler: GCC 10.2.0 (GCC 10.2); Clang 11.0.0-2 (LLVM Clang 11); Target: Clang 11.0.0 (AMD AOCC 2.3)
File-System: ext4
Screen Resolution: 1920x1080

Environment Details
- CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"

Compiler Details
- GCC 10.2: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- AMD AOCC 2.3: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver2

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x830101c

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

Test | GCC 10.2 | LLVM Clang 11 | AMD AOCC 2.3
dav1d: Chimera 1080p | 564.96 | 572.19 | 575.22
dav1d: Summer Nature 4K | 269.72 | 275.41 | 274.20
dav1d: Summer Nature 1080p | 567.42 | 584.65 | 588.62
dav1d: Chimera 1080p 10-bit | 143.02 | 92.56 | 122.39
svt-av1: Enc Mode 0 - 1080p | 0.103 | 0.145 | 0.146
svt-av1: Enc Mode 4 - 1080p | 6.775 | 8.591 | 8.657
svt-av1: Enc Mode 8 - 1080p | 55.713 | 70.229 | 70.698
svt-vp9: VMAF Optimized - Bosphorus 1080p | 354.98 | 363.72 | 366.61
svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p | 348.24 | 365.57 | 368.17
svt-vp9: Visual Quality Optimized - Bosphorus 1080p | 279.73 | 286.24 | 291.22
vpxenc: Speed 0 | 6.27 | 5.96 | 6.65
vpxenc: Speed 5 | 19.03 | 18.23 | 20.12
x264: H.264 Video Encoding | 149.35 | 146.43 | 151.92
x265: Bosphorus 4K | 22.83 | 22.44 | 23.69
x265: Bosphorus 1080p | 49.05 | 50.02 | 50.60
graphics-magick: Swirl | 1295 | 1323 | 1368
graphics-magick: Rotate | 535 | 527 | 525
graphics-magick: Sharpen | 434 | 384 | 316
graphics-magick: Enhanced | 590 | 627 | 526
graphics-magick: Resizing | 2092 | 1827 | 1653
compress-lz4: 1 - Compression Speed | 9459.69 | 9838.54 | 9780.39
compress-lz4: 3 - Compression Speed | 45.57 | 48.40 | 48.52
compress-lz4: 9 - Compression Speed | 44.72 | 44.30 | 45.76
compress-zstd: 3 | 7849.6 | 7866.2 | 7937.8
compress-zstd: 19 | 109.9 | 111.2 | 114.6
tjbench: Decompression Throughput | 171.888797 | 174.991039 | 176.873794
scimark2: Composite | 2759.35 | 2673.79 | 2779.62
cryptopp: Unkeyed Algorithms | 305.900410 | 314.389914 | 312.637840
libraw: Post-Processing Benchmark | 52.54 | 36.98 | 38.47
tscp: AI Chess Performance | 1007283 | 1143642 | 1138442
stockfish: Total Time | 58347386 | 62434784 | 62375605
hint: FLOAT | 291925951.64210 | 292874904.53733 | 294314450.61706
redis: LPUSH | 1212068.06 | 1304842.23 | 1380024.29
redis: GET | 1809976.83 | 2122749.43 | 1874175.91
redis: SET | 1369757.13 | 1483322.31 | 1446872.35
nginx: Static Web Page Serving | 30658.67 | 30676.68 | 31368.86
openssl: RSA 4096-bit Performance | 7395.4 | 5412.5 | 5413.5
daphne: OpenMP - NDT Mapping | 874.54 | 945.32 | 923.68
daphne: OpenMP - Points2Image | 18452.623694826 | 11946.566719313 | 13720.155459109
daphne: OpenMP - Euclidean Cluster | 890.71 | 674.01 | 678.51
pgbench: 1 - 1 - Read Only | 28919 | 28921 | 29645
pgbench: 1 - 1 - Read Write | 3739 | 3772 | 3865
pgbench: 1 - 50 - Read Only | 521332 | 530684 | 541314
pgbench: 1 - 50 - Read Write | 3453 | 3443 | 3413
webp: Quality 100, Lossless | 20.799 | 20.719 | 20.904
webp: Quality 100, Highest Compression | 8.861 | 7.655 | 7.546
webp: Quality 100, Lossless, Highest Compression | 44.383 | 42.377 | 42.704
onednn: IP Batch 1D - f32 - CPU | 1.67503 | 1.72247 | 1.40879
onednn: IP Batch 1D - u8s8f32 - CPU | 1.17556 | 1.03137 | 1.00926
onednn: IP Batch All - u8s8f32 - CPU | 13.8793 | 13.4649 | 11.9233
onednn: Deconvolution Batch deconv_1d - f32 - CPU | 1.95695 | 1.69390 | 1.66772
onednn: Deconvolution Batch deconv_3d - f32 - CPU | 3.68073 | 3.72354 | 3.66096
onednn: Deconvolution Batch deconv_1d - u8s8f32 - CPU | 2.17957 | 2.05539 | 1.99141
onednn: Deconvolution Batch deconv_3d - u8s8f32 - CPU | 1.94380 | 1.98927 | 1.87485
onednn: Recurrent Neural Network Training - f32 - CPU | 254.610 | 172.513 | 147.523
onednn: Recurrent Neural Network Inference - f32 - CPU | 79.2495 | 64.9912 | 30.7182
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.532944 | 0.575864 | 0.464235
pgbench: 1 - 1 - Read Only - Average Latency | 0.035 | 0.035 | 0.034
pgbench: 1 - 1 - Read Write - Average Latency | 0.268 | 0.265 | 0.259
pgbench: 1 - 50 - Read Only - Average Latency | 0.096 | 0.094 | 0.092
pgbench: 1 - 50 - Read Write - Average Latency | 14.488 | 14.528 | 14.655
ncnn: CPU - squeezenet | 17.29 | 15.35 | 14.38
ncnn: CPU - mobilenet | 19.46 | 17.85 | 17.02
ncnn: CPU-v2-v2 - mobilenet-v2 | 9.37 | 6.88 | 5.62
ncnn: CPU-v3-v3 - mobilenet-v3 | 8.88 | 6.87 | 5.20
ncnn: CPU - shufflenet-v2 | 10.59 | 7.35 | 6.25
ncnn: CPU - mnasnet | 8.95 | 6.26 | 5.01
ncnn: CPU - efficientnet-b0 | 11.32 | 8.77 | 6.94
ncnn: CPU - blazeface | 3.99 | 2.84 | 2.19
ncnn: CPU - googlenet | 20.85 | 18.88 | 16.51
ncnn: CPU - vgg16 | 30.89 | 36.60 | 32.03
ncnn: CPU - resnet18 | 13.12 | 13.19 | 11.99
ncnn: CPU - alexnet | 9.46 | 10.83 | 8.94
ncnn: CPU - resnet50 | 23.70 | 22.23 | 19.70
ncnn: CPU - yolov4-tiny | 29.47 | 30.70 | 29.38
tnn: CPU - MobileNet v2 | 324.172 | 392.204 | 365.390
tnn: CPU - SqueezeNet v1.1 | 305.134 | 311.682 | 304.178
mrbayes: Primate Phylogeny Analysis | 94.284 | 93.841 | 90.759
c-ray: Total Time - 4K, 16 Rays Per Pixel | 18.922 | 30.646 | 33.275
aobench: 2048 x 2048 - Total Time | 35.783 | 41.655 | 37.758
encode-mp3: WAV To MP3 | 8.798 | 10.078 | 11.026
rnnoise | 21.723 | 21.540 | 20.756
astcenc: Medium | 7.00 | 6.05 | 5.81
astcenc: Thorough | 9.61 | 8.32 | 8.16
astcenc: Exhaustive | 73.46 | 67.17 | 66.04
basis: UASTC Level 2 | 16.507 | 16.485 | 16.315
basis: UASTC Level 3 | 25.522 | 25.321 | 25.210
basis: UASTC Level 2 + RDO Post-Processing | 755.517 | 837.541 | 833.536
sqlite-speedtest: Timed Time - Size 1,000 | 75.140 | 77.642 | 78.067

dav1d

Video Input: Chimera 1080p

dav1d 0.7.0 (FPS, More Is Better)
AMD AOCC 2.3: 575.22 (SE +/- 0.49, N = 3; MIN: 414.12 / MAX: 729.95)
LLVM Clang 11: 572.19 (SE +/- 0.97, N = 3; MIN: 404.8 / MAX: 726.7)
GCC 10.2: 564.96 (SE +/- 1.05, N = 3; MIN: 399.64 / MAX: 724.27)
1. (CC) gcc options: -O3 -march=znver2 -pthread
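Each result carries an "SE +/- x, N = n" figure. A minimal sketch of the conventional standard error of the mean is given below for readers who want to reproduce such a figure from their own runs. Whether the export uses exactly this estimator (sample standard deviation divided by the square root of N) is an assumption, and the run values are hypothetical because the export lists only the aggregate.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run FPS values; the export does not list individual runs.
hypothetical_runs = [574.3, 575.4, 576.0]
mean_fps = statistics.mean(hypothetical_runs)
se_fps = standard_error(hypothetical_runs)
print(f"{mean_fps:.2f} (SE +/- {se_fps:.2f}, N = {len(hypothetical_runs)})")
```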

dav1d

Video Input: Summer Nature 4K

dav1d 0.7.0 (FPS, More Is Better)
LLVM Clang 11: 275.41 (SE +/- 1.08, N = 3; MIN: 155.99 / MAX: 295.35)
AMD AOCC 2.3: 274.20 (SE +/- 1.22, N = 3; MIN: 151.98 / MAX: 293.44)
GCC 10.2: 269.72 (SE +/- 0.57, N = 3; MIN: 160.12 / MAX: 288.9)
1. (CC) gcc options: -O3 -march=znver2 -pthread

dav1d

Video Input: Summer Nature 1080p

dav1d 0.7.0 (FPS, More Is Better)
AMD AOCC 2.3: 588.62 (SE +/- 2.09, N = 3; MIN: 345.64 / MAX: 651.08)
LLVM Clang 11: 584.65 (SE +/- 0.95, N = 3; MIN: 337.56 / MAX: 641.35)
GCC 10.2: 567.42 (SE +/- 0.36, N = 3; MIN: 337.19 / MAX: 625.71)
1. (CC) gcc options: -O3 -march=znver2 -pthread

dav1d

Video Input: Chimera 1080p 10-bit

dav1d 0.7.0 (FPS, More Is Better)
GCC 10.2: 143.02 (SE +/- 0.18, N = 3; MIN: 98.76 / MAX: 246.17)
AMD AOCC 2.3: 122.39 (SE +/- 0.34, N = 3; MIN: 85.78 / MAX: 202.39)
LLVM Clang 11: 92.56 (SE +/- 0.26, N = 3; MIN: 61.05 / MAX: 158.66)
1. (CC) gcc options: -O3 -march=znver2 -pthread
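To put a gap like the Chimera 1080p 10-bit one into percentages, the sketch below expresses each compiler relative to a chosen baseline. The helper name `relative_to_baseline` is hypothetical; the input values are the FPS figures reported for this test. For "Fewer Is Better" tests the ratio is inverted so that a positive number always means "faster than the baseline".

```python
def relative_to_baseline(results, baseline, higher_is_better=True):
    """Return the percentage advantage of each entry over the baseline entry."""
    base = results[baseline]
    deltas = {}
    for name, value in results.items():
        ratio = value / base if higher_is_better else base / value
        deltas[name] = (ratio - 1.0) * 100.0
    return deltas


# FPS values reported for dav1d, Chimera 1080p 10-bit.
chimera_10bit = {
    "GCC 10.2": 143.02,
    "AMD AOCC 2.3": 122.39,
    "LLVM Clang 11": 92.56,
}

deltas = relative_to_baseline(chimera_10bit, "GCC 10.2")
for name, pct in deltas.items():
    print(f"{name}: {pct:+.1f}% vs GCC 10.2")
```

With GCC 10.2 as the baseline, AMD AOCC 2.3 comes out roughly 14% behind and LLVM Clang 11 roughly 35% behind on this test.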

SVT-AV1

Encoder Mode: Enc Mode 0 - Input: 1080p

SVT-AV1 0.8 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 0.146 (SE +/- 0.000, N = 3)
LLVM Clang 11: 0.145 (SE +/- 0.000, N = 3)
GCC 10.2: 0.103 (SE +/- 0.000, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-AV1

Encoder Mode: Enc Mode 4 - Input: 1080p

SVT-AV1 0.8 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 8.657 (SE +/- 0.042, N = 3)
LLVM Clang 11: 8.591 (SE +/- 0.026, N = 3)
GCC 10.2: 6.775 (SE +/- 0.025, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-AV1

Encoder Mode: Enc Mode 8 - Input: 1080p

SVT-AV1 0.8 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 70.70 (SE +/- 0.46, N = 3)
LLVM Clang 11: 70.23 (SE +/- 0.30, N = 3)
GCC 10.2: 55.71 (SE +/- 0.29, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 366.61 (SE +/- 1.39, N = 3)
LLVM Clang 11: 363.72 (SE +/- 1.70, N = 3)
GCC 10.2: 354.98 (SE +/- 2.15, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 368.17 (SE +/- 0.54, N = 3)
LLVM Clang 11: 365.57 (SE +/- 1.74, N = 3)
GCC 10.2: 348.24 (SE +/- 1.42, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 291.22 (SE +/- 0.90, N = 3)
LLVM Clang 11: 286.24 (SE +/- 1.68, N = 3)
GCC 10.2: 279.73 (SE +/- 1.18, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

VP9 libvpx Encoding

Speed: Speed 0

VP9 libvpx Encoding 1.8.2 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 6.65 (SE +/- 0.00, N = 3)
GCC 10.2: 6.27 (SE +/- 0.02, N = 3)
LLVM Clang 11: 5.96 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -march=znver2 -fPIC -U_FORTIFY_SOURCE -std=c++11

VP9 libvpx Encoding

Speed: Speed 5

VP9 libvpx Encoding 1.8.2 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 20.12 (SE +/- 0.10, N = 3)
GCC 10.2: 19.03 (SE +/- 0.04, N = 3)
LLVM Clang 11: 18.23 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -march=znver2 -fPIC -U_FORTIFY_SOURCE -std=c++11

x264

H.264 Video Encoding

x264 2019-12-17 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 151.92 (SE +/- 0.67, N = 3)
GCC 10.2: 149.35 (SE +/- 1.24, N = 3)
LLVM Clang 11: 146.43 (SE +/- 0.59, N = 3)
Additional option reported for two of the three builds: -mstack-alignment=64
1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=znver2 -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize

x265

Video Input: Bosphorus 4K

x265 3.4 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 23.69 (SE +/- 0.05, N = 3)
GCC 10.2: 22.83 (SE +/- 0.06, N = 3)
LLVM Clang 11: 22.44 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread -lrt -ldl -lnuma

x265

Video Input: Bosphorus 1080p

x265 3.4 (Frames Per Second, More Is Better)
AMD AOCC 2.3: 50.60 (SE +/- 0.14, N = 3)
LLVM Clang 11: 50.02 (SE +/- 0.09, N = 3)
GCC 10.2: 49.05 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread -lrt -ldl -lnuma

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AMD AOCC 2.3: 1368 (SE +/- 4.26, N = 3)
LLVM Clang 11: 1323 (SE +/- 2.40, N = 3)
GCC 10.2: 1295 (SE +/- 1.00, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
GCC 10.2: 535
LLVM Clang 11: 527
AMD AOCC 2.3: 525
Only two standard errors are reported for the three results (SE +/- 5.24, N = 3 and SE +/- 1.00, N = 3); the export does not show which results they belong to.
1. (CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
GCC 10.2: 434
LLVM Clang 11: 384
AMD AOCC 2.3: 316
1. (CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
LLVM Clang 11: 627
GCC 10.2: 590
AMD AOCC 2.3: 526
1. (CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
GCC 10.2: 2092 (SE +/- 17.89, N = 3)
LLVM Clang 11: 1827 (SE +/- 10.17, N = 3)
AMD AOCC 2.3: 1653 (SE +/- 22.27, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

LZ4 Compression

Compression Level: 1 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
LLVM Clang 11: 9838.54 (SE +/- 55.02, N = 3)
AMD AOCC 2.3: 9780.39 (SE +/- 45.16, N = 3)
GCC 10.2: 9459.69 (SE +/- 56.41, N = 3)
1. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 3 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
AMD AOCC 2.3: 48.52 (SE +/- 0.00, N = 3)
LLVM Clang 11: 48.40 (SE +/- 0.04, N = 3)
GCC 10.2: 45.57 (SE +/- 0.66, N = 3)
1. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 9 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
AMD AOCC 2.3: 45.76 (SE +/- 0.03, N = 3)
GCC 10.2: 44.72 (SE +/- 0.60, N = 3)
LLVM Clang 11: 44.30 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O3

Zstd Compression

Compression Level: 3

Zstd Compression 1.4.5 (MB/s, More Is Better)
AMD AOCC 2.3: 7937.8 (SE +/- 3.73, N = 3)
LLVM Clang 11: 7866.2 (SE +/- 30.93, N = 3)
GCC 10.2: 7849.6 (SE +/- 6.11, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -pthread -lz

Zstd Compression

Compression Level: 19

Zstd Compression 1.4.5 (MB/s, More Is Better)
AMD AOCC 2.3: 114.6 (SE +/- 0.26, N = 3)
LLVM Clang 11: 111.2 (SE +/- 0.30, N = 3)
GCC 10.2: 109.9 (SE +/- 0.23, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -pthread -lz

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 2.0.2 (Megapixels/sec, More Is Better)
AMD AOCC 2.3: 176.87 (SE +/- 0.02, N = 3)
LLVM Clang 11: 174.99 (SE +/- 0.21, N = 3)
GCC 10.2: 171.89 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -rdynamic

SciMark

Computational Test: Composite

SciMark 2.0 (Mflops, More Is Better)
AMD AOCC 2.3: 2779.62 (SE +/- 43.99, N = 3)
GCC 10.2: 2759.35 (SE +/- 14.97, N = 3)
LLVM Clang 11: 2673.79 (SE +/- 7.04, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -lm

Crypto++

Test: Unkeyed Algorithms

Crypto++ 8.2 (MiB/second, More Is Better)
LLVM Clang 11: 314.39 (SE +/- 0.20, N = 3)
AMD AOCC 2.3: 312.64 (SE +/- 0.16, N = 3)
GCC 10.2: 305.90 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -O3 -march=znver2 -fPIC -pthread -pipe

LibRaw

Post-Processing Benchmark

LibRaw 0.20 (Mpix/sec, More Is Better)
GCC 10.2: 52.54 (SE +/- 0.16, N = 3)
AMD AOCC 2.3: 38.47 (SE +/- 0.10, N = 3)
LLVM Clang 11: 36.98 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -O3 -march=znver2 -fopenmp -ljpeg -lz -lm

TSCP

AI Chess Performance

TSCP 1.81 (Nodes Per Second, More Is Better)
LLVM Clang 11: 1143642 (SE +/- 582.00, N = 5)
AMD AOCC 2.3: 1138442 (SE +/- 471.20, N = 5)
GCC 10.2: 1007283 (SE +/- 1467.20, N = 5)
1. (CC) gcc options: -O3 -march=znver2 -march=native

Stockfish

Total Time

Stockfish 12 (Nodes Per Second, More Is Better)
LLVM Clang 11: 62434784 (SE +/- 877956.02, N = 4; -flto=thin)
AMD AOCC 2.3: 62375605 (SE +/- 379890.17, N = 3; -flto=thin)
GCC 10.2: 58347386 (SE +/- 872585.44, N = 3; -flto -flto=jobserver)
1. (CXX) g++ options: -m64 -lpthread -O3 -march=znver2 -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2

Hierarchical INTegration

Test: FLOAT

Hierarchical INTegration 1.0 (QUIPs, More Is Better)
AMD AOCC 2.3: 294314450.62 (SE +/- 30193.74, N = 3)
LLVM Clang 11: 292874904.54 (SE +/- 170353.62, N = 3)
GCC 10.2: 291925951.64 (SE +/- 15707.07, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -march=native -lm

Redis

Test: LPUSH

Redis 6.0.9 (Requests Per Second, More Is Better)
AMD AOCC 2.3: 1380024.29 (SE +/- 18763.42, N = 3)
LLVM Clang 11: 1304842.23 (SE +/- 22719.73, N = 15)
GCC 10.2: 1212068.06 (SE +/- 21030.89, N = 15)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3 -march=znver2

Redis

Test: GET

Redis 6.0.9 (Requests Per Second, More Is Better)
LLVM Clang 11: 2122749.43 (SE +/- 49885.99, N = 15)
AMD AOCC 2.3: 1874175.91 (SE +/- 30693.11, N = 15)
GCC 10.2: 1809976.83 (SE +/- 18838.90, N = 3)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3 -march=znver2

Redis

Test: SET

Redis 6.0.9 (Requests Per Second, More Is Better)
LLVM Clang 11: 1483322.31 (SE +/- 22282.05, N = 15)
AMD AOCC 2.3: 1446872.35 (SE +/- 27170.26, N = 15)
GCC 10.2: 1369757.13 (SE +/- 24810.19, N = 15)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3 -march=znver2

NGINX Benchmark

Static Web Page Serving

NGINX Benchmark 1.9.9 (Requests Per Second, More Is Better)
AMD AOCC 2.3: 31368.86 (SE +/- 254.59, N = 15)
LLVM Clang 11: 30676.68 (SE +/- 159.74, N = 3)
GCC 10.2: 30658.67 (SE +/- 381.73, N = 4)
1. (CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native -march=znver2

OpenSSL

RSA 4096-bit Performance

OpenSSL 1.1.1 (Signs Per Second, More Is Better)
GCC 10.2: 7395.4 (SE +/- 0.73, N = 3)
AMD AOCC 2.3: 5413.5 (SE +/- 0.64, N = 3)
LLVM Clang 11: 5412.5 (SE +/- 1.37, N = 3)
Additional option reported for two of the three builds: -Qunused-arguments
1. (CC) gcc options: -pthread -m64 -O3 -march=znver2 -lssl -lcrypto -ldl

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: NDT Mapping

Darmstadt Automotive Parallel Heterogeneous Suite (Test Cases Per Minute, More Is Better)
LLVM Clang 11: 945.32 (SE +/- 2.29, N = 3)
AMD AOCC 2.3: 923.68 (SE +/- 1.07, N = 3)
GCC 10.2: 874.54 (SE +/- 2.97, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: Points2Image

Darmstadt Automotive Parallel Heterogeneous Suite (Test Cases Per Minute, More Is Better)
GCC 10.2: 18452.62 (SE +/- 140.90, N = 3)
AMD AOCC 2.3: 13720.16 (SE +/- 163.63, N = 6)
LLVM Clang 11: 11946.57 (SE +/- 96.60, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: Euclidean Cluster

Darmstadt Automotive Parallel Heterogeneous Suite (Test Cases Per Minute, More Is Better)
GCC 10.2: 890.71 (SE +/- 0.77, N = 3)
AMD AOCC 2.3: 678.51 (SE +/- 0.47, N = 3)
LLVM Clang 11: 674.01 (SE +/- 2.53, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Only

PostgreSQL pgbench 13.0 (TPS, More Is Better)
AMD AOCC 2.3: 29645 (SE +/- 73.04, N = 3)
LLVM Clang 11: 28921 (SE +/- 252.07, N = 3)
GCC 10.2: 28919 (SE +/- 137.51, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Write

PostgreSQL pgbench 13.0 (TPS, More Is Better)
AMD AOCC 2.3: 3865 (SE +/- 18.01, N = 3)
LLVM Clang 11: 3772 (SE +/- 3.59, N = 3)
GCC 10.2: 3739 (SE +/- 46.08, N = 5)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Only

PostgreSQL pgbench 13.0 (TPS, More Is Better)
AMD AOCC 2.3: 541314 (SE +/- 798.75, N = 3)
LLVM Clang 11: 530684 (SE +/- 4278.21, N = 3)
GCC 10.2: 521332 (SE +/- 4440.25, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Write

PostgreSQL pgbench 13.0 (TPS, More Is Better)
GCC 10.2: 3453 (SE +/- 5.86, N = 3)
LLVM Clang 11: 3443 (SE +/- 4.43, N = 3)
AMD AOCC 2.3: 3413 (SE +/- 5.71, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
LLVM Clang 11: 20.72 (SE +/- 0.02, N = 3)
GCC 10.2: 20.80 (SE +/- 0.08, N = 3)
AMD AOCC 2.3: 20.90 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=znver2 -pthread -lm -ljpeg

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
AMD AOCC 2.3: 7.546 (SE +/- 0.008, N = 3)
LLVM Clang 11: 7.655 (SE +/- 0.008, N = 3)
GCC 10.2: 8.861 (SE +/- 0.003, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=znver2 -pthread -lm -ljpeg

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
LLVM Clang 11: 42.38 (SE +/- 0.12, N = 3)
AMD AOCC 2.3: 42.70 (SE +/- 0.01, N = 3)
GCC 10.2: 44.38 (SE +/- 0.12, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=znver2 -pthread -lm -ljpeg

oneDNN

Harness: IP Batch 1D - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AMD AOCC 2.3: 1.40879 (SE +/- 0.00432, N = 3; -fopenmp=libomp; MIN: 1.36)
GCC 10.2: 1.67503 (SE +/- 0.00519, N = 3; -fopenmp; MIN: 1.59)
LLVM Clang 11: 1.72247 (SE +/- 0.00256, N = 3; -fopenmp=libomp; MIN: 1.62)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Batch 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AMD AOCC 2.3: 1.00926 (SE +/- 0.00275, N = 3; -fopenmp=libomp; MIN: 0.95)
LLVM Clang 11: 1.03137 (SE +/- 0.00137, N = 3; -fopenmp=libomp; MIN: 0.97)
GCC 10.2: 1.17556 (SE +/- 0.00350, N = 3; -fopenmp; MIN: 1.13)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Batch All - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AMD AOCC 2.3: 11.92 (SE +/- 0.02, N = 3; -fopenmp=libomp; MIN: 11.65)
LLVM Clang 11: 13.46 (SE +/- 0.04, N = 3; -fopenmp=libomp; MIN: 13.14)
GCC 10.2: 13.88 (SE +/- 0.06, N = 3; -fopenmp; MIN: 13.35)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AMD AOCC 2.3: 1.66772 (SE +/- 0.00186, N = 3; -fopenmp=libomp; MIN: 1.61)
LLVM Clang 11: 1.69390 (SE +/- 0.01023, N = 3; -fopenmp=libomp; MIN: 1.63)
GCC 10.2: 1.95695 (SE +/- 0.01336, N = 3; -fopenmp; MIN: 1.87)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AMD AOCC 2.3: 3.66096 (SE +/- 0.01101, N = 3; -fopenmp=libomp; MIN: 3.49)
GCC 10.2: 3.68073 (SE +/- 0.01624, N = 3; -fopenmp; MIN: 3.53)
LLVM Clang 11: 3.72354 (SE +/- 0.00775, N = 3; -fopenmp=libomp; MIN: 3.57)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AMD AOCC 2.3: 1.99141 (SE +/- 0.00112, N = 3; -fopenmp=libomp; MIN: 1.92)
LLVM Clang 11: 2.05539 (SE +/- 0.00233, N = 3; -fopenmp=libomp; MIN: 1.96)
GCC 10.2: 2.17957 (SE +/- 0.00145, N = 3; -fopenmp; MIN: 2.05)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AMD AOCC 2.3: 1.87485 (SE +/- 0.00189, N = 3; -fopenmp=libomp; MIN: 1.82)
GCC 10.2: 1.94380 (SE +/- 0.00392, N = 3; -fopenmp; MIN: 1.82)
LLVM Clang 11: 1.98927 (SE +/- 0.00691, N = 3; -fopenmp=libomp; MIN: 1.9)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AMD AOCC 2.3: 147.52 (SE +/- 0.06, N = 3; -fopenmp=libomp; MIN: 146.31)
LLVM Clang 11: 172.51 (SE +/- 1.10, N = 3; -fopenmp=libomp; MIN: 169.54)
GCC 10.2: 254.61 (SE +/- 0.73, N = 3; -fopenmp; MIN: 251.67)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AMD AOCC 2.3: 30.72 (SE +/- 0.32, N = 3; -fopenmp=libomp; MIN: 29.35)
LLVM Clang 11: 64.99 (SE +/- 2.37, N = 15; -fopenmp=libomp; MIN: 50.95)
GCC 10.2: 79.25 (SE +/- 0.80, N = 3; -fopenmp; MIN: 77.35)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
AMD AOCC 2.3: 0.464235 (SE +/- 0.000544, N = 3; -fopenmp=libomp; MIN: 0.45)
GCC 10.2: 0.532944 (SE +/- 0.001691, N = 3; -fopenmp; MIN: 0.51)
LLVM Clang 11: 0.575864 (SE +/- 0.001767, N = 3; -fopenmp=libomp; MIN: 0.55)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Only - Average Latency

PostgreSQL pgbench 13.0 (ms, Fewer Is Better)
AMD AOCC 2.3: 0.034 (SE +/- 0.000, N = 3)
GCC 10.2: 0.035 (SE +/- 0.000, N = 3)
LLVM Clang 11: 0.035 (SE +/- 0.000, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Write - Average Latency

PostgreSQL pgbench 13.0 (ms, Fewer Is Better)
AMD AOCC 2.3: 0.259 (SE +/- 0.001, N = 3)
LLVM Clang 11: 0.265 (SE +/- 0.000, N = 3)
GCC 10.2: 0.268 (SE +/- 0.003, N = 5)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Only - Average Latency

PostgreSQL pgbench 13.0 (ms, Fewer Is Better)
AMD AOCC 2.3: 0.092 (SE +/- 0.000, N = 3)
LLVM Clang 11: 0.094 (SE +/- 0.001, N = 3)
GCC 10.2: 0.096 (SE +/- 0.001, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Write - Average Latency

PostgreSQL pgbench 13.0 (ms, Fewer Is Better)
GCC 10.2: 14.49 (SE +/- 0.02, N = 3)
LLVM Clang 11: 14.53 (SE +/- 0.02, N = 3)
AMD AOCC 2.3: 14.66 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

NCNN

Target: CPU - Model: squeezenet

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 14.38 (SE +/- 0.10, N = 3; -lomp; MIN: 13.96 / MAX: 16.87)
LLVM Clang 11: 15.35 (SE +/- 0.14, N = 15; -lomp; MIN: 14.29 / MAX: 18.88)
GCC 10.2: 17.29 (SE +/- 0.12, N = 3; -lgomp; MIN: 16.89 / MAX: 19.32)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: mobilenet

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 17.02 (SE +/- 0.34, N = 3; -lomp; MIN: 16.39 / MAX: 20.11)
LLVM Clang 11: 17.85 (SE +/- 0.13, N = 15; -lomp; MIN: 16.89 / MAX: 21.04)
GCC 10.2: 19.46 (SE +/- 0.12, N = 3; -lgomp; MIN: 18.87 / MAX: 31.76)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 5.62 (SE +/- 0.08, N = 3; -lomp; MIN: 5.35 / MAX: 7.49)
LLVM Clang 11: 6.88 (SE +/- 0.06, N = 15; -lomp; MIN: 6.35 / MAX: 16.49)
GCC 10.2: 9.37 (SE +/- 0.24, N = 3; -lgomp; MIN: 8.62 / MAX: 11)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 5.20 (SE +/- 0.04, N = 3; -lomp; MIN: 5.05 / MAX: 7.61)
LLVM Clang 11: 6.87 (SE +/- 0.06, N = 15; -lomp; MIN: 6.16 / MAX: 8.64)
GCC 10.2: 8.88 (SE +/- 0.11, N = 3; -lgomp; MIN: 8.58 / MAX: 10.83)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 6.25 (SE +/- 0.12, N = 3; -lomp; MIN: 5.96 / MAX: 6.53)
LLVM Clang 11: 7.35 (SE +/- 0.02, N = 15; -lomp; MIN: 7.09 / MAX: 11.01)
GCC 10.2: 10.59 (SE +/- 0.68, N = 3; -lgomp; MIN: 9.42 / MAX: 13.72)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 5.01 (SE +/- 0.03, N = 3; -lomp; MIN: 4.87 / MAX: 5.39)
LLVM Clang 11: 6.26 (SE +/- 0.09, N = 15; -lomp; MIN: 5.63 / MAX: 8.5)
GCC 10.2: 8.95 (SE +/- 0.29, N = 3; -lgomp; MIN: 8.26 / MAX: 11.03)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 6.94 (SE +/- 0.05, N = 3; MIN: 6.72 / MAX: 9; -lomp)
LLVM Clang 11: 8.77 (SE +/- 0.09, N = 15; MIN: 7.99 / MAX: 13.6; -lomp)
GCC 10.2: 11.32 (SE +/- 0.09, N = 3; MIN: 11.03 / MAX: 13.16; -lgomp)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: blazeface

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 2.19 (SE +/- 0.03, N = 3; MIN: 2.1 / MAX: 2.43; -lomp)
LLVM Clang 11: 2.84 (SE +/- 0.02, N = 15; MIN: 2.67 / MAX: 4.67; -lomp)
GCC 10.2: 3.99 (SE +/- 0.10, N = 3; MIN: 3.77 / MAX: 4.73; -lgomp)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: googlenet

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 16.51 (SE +/- 0.05, N = 3; MIN: 16.26 / MAX: 18.75; -lomp)
LLVM Clang 11: 18.88 (SE +/- 0.27, N = 15; MIN: 16.91 / MAX: 23.31; -lomp)
GCC 10.2: 20.85 (SE +/- 0.10, N = 3; MIN: 19.8 / MAX: 22.84; -lgomp)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: vgg16

NCNN 20200916 (ms, Fewer Is Better)
GCC 10.2: 30.89 (SE +/- 0.30, N = 3; MIN: 30.04 / MAX: 50.21; -lgomp)
AMD AOCC 2.3: 32.03 (SE +/- 0.31, N = 3; MIN: 30.86 / MAX: 35.2; -lomp)
LLVM Clang 11: 36.60 (SE +/- 0.42, N = 15; MIN: 32.93 / MAX: 48.08; -lomp)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: resnet18

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 11.99 (SE +/- 0.15, N = 3; MIN: 11.71 / MAX: 14.27; -lomp)
GCC 10.2: 13.12 (SE +/- 0.09, N = 3; MIN: 12.79 / MAX: 15.11; -lgomp)
LLVM Clang 11: 13.19 (SE +/- 0.12, N = 15; MIN: 12.17 / MAX: 23.53; -lomp)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: alexnet

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 8.94 (SE +/- 0.01, N = 3; MIN: 8.81 / MAX: 13.49; -lomp)
GCC 10.2: 9.46 (SE +/- 0.14, N = 3; MIN: 9.17 / MAX: 11.49; -lgomp)
LLVM Clang 11: 10.83 (SE +/- 0.18, N = 15; MIN: 9.15 / MAX: 60.4; -lomp)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: resnet50

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 19.70 (SE +/- 0.18, N = 3; MIN: 19.16 / MAX: 22.41; -lomp)
LLVM Clang 11: 22.23 (SE +/- 0.20, N = 15; MIN: 20.63 / MAX: 31.79; -lomp)
GCC 10.2: 23.70 (SE +/- 0.16, N = 3; MIN: 23.21 / MAX: 25.68; -lgomp)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20200916 (ms, Fewer Is Better)
AMD AOCC 2.3: 29.38 (SE +/- 0.15, N = 3; MIN: 28.77 / MAX: 31.68; -lomp)
GCC 10.2: 29.47 (SE +/- 0.12, N = 3; MIN: 28.89 / MAX: 31.49; -lgomp)
LLVM Clang 11: 30.70 (SE +/- 0.19, N = 15; MIN: 29.08 / MAX: 40.6; -lomp)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

TNN

Target: CPU - Model: MobileNet v2

TNN 0.2.3 (ms, Fewer Is Better)
GCC 10.2: 324.17 (SE +/- 0.62, N = 3; MIN: 311.9 / MAX: 354.48; -fopenmp)
AMD AOCC 2.3: 365.39 (SE +/- 0.24, N = 3; MIN: 364.57 / MAX: 366.34; -fopenmp=libomp)
LLVM Clang 11: 392.20 (SE +/- 0.45, N = 3; MIN: 390.71 / MAX: 393.87; -fopenmp=libomp)
(CXX) g++ options: -O3 -march=znver2 -pthread -fvisibility=hidden -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.2.3 (ms, Fewer Is Better)
AMD AOCC 2.3: 304.18 (SE +/- 0.66, N = 3; MIN: 302.78 / MAX: 315.91; -fopenmp=libomp)
GCC 10.2: 305.13 (SE +/- 0.20, N = 3; MIN: 304.36 / MAX: 306.06; -fopenmp)
LLVM Clang 11: 311.68 (SE +/- 1.92, N = 3; MIN: 306.99 / MAX: 314.32; -fopenmp=libomp)
(CXX) g++ options: -O3 -march=znver2 -pthread -fvisibility=hidden -rdynamic -ldl

Timed MrBayes Analysis

Primate Phylogeny Analysis

Timed MrBayes Analysis 3.2.7 (Seconds, Fewer Is Better)
AMD AOCC 2.3: 90.76 (SE +/- 0.04, N = 3)
LLVM Clang 11: 93.84 (SE +/- 0.04, N = 3)
GCC 10.2: 94.28 (SE +/- 0.17, N = 3)
Build-specific flag (one result only): -mabm
(CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -O3 -std=c99 -pedantic -march=znver2 -lm

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1 (Seconds, Fewer Is Better)
GCC 10.2: 18.92 (SE +/- 0.02, N = 3)
LLVM Clang 11: 30.65 (SE +/- 0.08, N = 3)
AMD AOCC 2.3: 33.28 (SE +/- 0.06, N = 3)
(CC) gcc options: -lm -lpthread -O3 -march=znver2

AOBench

Size: 2048 x 2048 - Total Time

AOBench (Seconds, Fewer Is Better)
GCC 10.2: 35.78 (SE +/- 0.02, N = 3)
AMD AOCC 2.3: 37.76 (SE +/- 0.04, N = 3)
LLVM Clang 11: 41.66 (SE +/- 0.02, N = 3)
(CC) gcc options: -lm -O3 -march=znver2

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100 (Seconds, Fewer Is Better)
GCC 10.2: 8.798 (SE +/- 0.004, N = 3)
LLVM Clang 11: 10.078 (SE +/- 0.003, N = 3)
AMD AOCC 2.3: 11.026 (SE +/- 0.004, N = 3)
Build-specific flags (one result only): -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr
(CC) gcc options: -O3 -pipe -march=znver2 -lncurses -lm

RNNoise

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
AMD AOCC 2.3: 20.76 (SE +/- 0.01, N = 3)
LLVM Clang 11: 21.54 (SE +/- 0.00, N = 3)
GCC 10.2: 21.72 (SE +/- 0.02, N = 3)
(CC) gcc options: -O3 -march=znver2 -pedantic -fvisibility=hidden

ASTC Encoder

Preset: Medium

ASTC Encoder 2.0 (Seconds, Fewer Is Better)
AMD AOCC 2.3: 5.81 (SE +/- 0.01, N = 3)
LLVM Clang 11: 6.05 (SE +/- 0.01, N = 3)
GCC 10.2: 7.00 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

ASTC Encoder

Preset: Thorough

ASTC Encoder 2.0 (Seconds, Fewer Is Better)
AMD AOCC 2.3: 8.16 (SE +/- 0.00, N = 3)
LLVM Clang 11: 8.32 (SE +/- 0.00, N = 3)
GCC 10.2: 9.61 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 2.0 (Seconds, Fewer Is Better)
AMD AOCC 2.3: 66.04 (SE +/- 0.05, N = 3)
LLVM Clang 11: 67.17 (SE +/- 0.01, N = 3)
GCC 10.2: 73.46 (SE +/- 0.02, N = 3)
(CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

Basis Universal

Settings: UASTC Level 2

Basis Universal 1.12 (Seconds, Fewer Is Better)
AMD AOCC 2.3: 16.32 (SE +/- 0.01, N = 3)
LLVM Clang 11: 16.49 (SE +/- 0.01, N = 3)
GCC 10.2: 16.51 (SE +/- 0.02, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.12 (Seconds, Fewer Is Better)
AMD AOCC 2.3: 25.21 (SE +/- 0.00, N = 3)
LLVM Clang 11: 25.32 (SE +/- 0.01, N = 3)
GCC 10.2: 25.52 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 2 + RDO Post-Processing

Basis Universal 1.12 (Seconds, Fewer Is Better)
GCC 10.2: 755.52 (SE +/- 0.14, N = 3)
AMD AOCC 2.3: 833.54 (SE +/- 0.04, N = 3)
LLVM Clang 11: 837.54 (SE +/- 0.24, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

SQLite Speedtest

Timed Time - Size 1,000

SQLite Speedtest 3.30 (Seconds, Fewer Is Better)
GCC 10.2: 75.14 (SE +/- 0.13, N = 3)
LLVM Clang 11: 77.64 (SE +/- 0.18, N = 3)
AMD AOCC 2.3: 78.07 (SE +/- 0.09, N = 3)
(CC) gcc options: -O3 -march=znver2 -ldl -lz -lpthread

Geometric Mean Of All Test Results

Result Composite - EPYC 7502 AOCC 2.3 Compiler Comparison

Geometric Mean, More Is Better
AMD AOCC 2.3: 121.37
LLVM Clang 11: 115.39
GCC 10.2: 113.07
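
One way to read the composite is as a percentage lead over a baseline. The following is a minimal Python sketch, not part of the exported result: it takes the three geometric-mean values exactly as reported and normalizes them to GCC 10.2. The choice of GCC 10.2 as baseline and the helper name `relative_to` are assumptions made for illustration only.

```python
# Composite geometric means as reported in the result file (more is better).
composite = {
    "AMD AOCC 2.3": 121.37,
    "LLVM Clang 11": 115.39,
    "GCC 10.2": 113.07,
}


def relative_to(baseline, scores):
    """Return each score's percentage difference from the baseline score.

    Hypothetical helper for illustration; rounds to one decimal place.
    """
    base = scores[baseline]
    return {
        name: round((value / base - 1.0) * 100.0, 1)
        for name, value in scores.items()
    }


rel = relative_to("GCC 10.2", composite)
for name, pct in rel.items():
    print(f"{name}: {pct:+.1f}% vs GCC 10.2")
```

With these inputs the sketch puts AOCC 2.3 roughly 7.3% and Clang 11 roughly 2.1% ahead of GCC 10.2 on the composite.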

Number Of First Place Finishes

Wins - 89 Tests

LLVM Clang 11: 11 [12.4%]
GCC 10.2: 17 [19.1%]
AMD AOCC 2.3: 61 [68.5%]

Number Of Last Place Finishes

Losses - 89 Tests

AMD AOCC 2.3: 10 [11.2%]
LLVM Clang 11: 23 [25.8%]
GCC 10.2: 56 [62.9%]
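
The first-place and last-place percentages are shares of the 89 tests. As a quick consistency check, here is a small Python sketch that recomputes them from the reported counts. The function name `shares` is hypothetical, and rounding to one decimal place is an assumption chosen to match how the figures are printed.

```python
TOTAL_TESTS = 89  # "Wins - 89 Tests" / "Losses - 89 Tests"

# First-place and last-place counts as reported above.
wins = {"LLVM Clang 11": 11, "GCC 10.2": 17, "AMD AOCC 2.3": 61}
losses = {"AMD AOCC 2.3": 10, "LLVM Clang 11": 23, "GCC 10.2": 56}


def shares(counts, total):
    """Convert raw counts to percentages of `total`, one decimal place."""
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


win_pct = shares(wins, TOTAL_TESTS)
loss_pct = shares(losses, TOTAL_TESTS)

# Every test has exactly one winner and one loser, so both sum to 89.
assert sum(wins.values()) == TOTAL_TESTS
assert sum(losses.values()) == TOTAL_TESTS

print(win_pct)
print(loss_pct)
```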


Phoronix Test Suite v10.8.4