EPYC 7502 AOCC 2.3 Compiler Comparison

Benchmarks on an AMD EPYC 7502 comparing binaries built with AMD AOCC 2.3, GCC 10.2, and LLVM Clang 11. CFLAGS/CXXFLAGS were set to "-O3 -march=znver2" throughout. Benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2012080-HA-EPYC7502A97&sgm=1&swl=1&grw&sro&rro.

System details (shared by the GCC 10.2, LLVM Clang 11, and AMD AOCC 2.3 runs except for the compiler)

Processor: AMD EPYC 7502 32-Core @ 2.50GHz (32 Cores / 64 Threads)
Motherboard: ASRockRack EPYCD8 (P2.10 BIOS)
Chipset: AMD Starship/Matisse
Memory: 126GB
Disk: 280GB INTEL SSDPED1D280GA
Graphics: ASPEED
Audio: AMD Starship/Matisse
Monitor: VE228
Network: 2 x Intel I350
OS: Ubuntu 20.10
Kernel: 5.8.0-31-generic (x86_64)
Desktop: GNOME Shell 3.38.1
Display Server: X Server 1.20.9
Display Driver: modesetting 1.20.9
Compiler: GCC 10.2.0 (GCC 10.2); Clang 11.0.0-2 (LLVM Clang 11); Clang 11.0.0 (AMD AOCC 2.3)
File-System: ext4
Screen Resolution: 1920x1080

Environment Details
- CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"

Compiler Details
- GCC 10.2: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- AMD AOCC 2.3: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver2

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x830101c

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

Result overview

Test | GCC 10.2 | LLVM Clang 11 | AMD AOCC 2.3
tscp: AI Chess Performance | 1007283 | 1143642 | 1138442
cryptopp: Unkeyed Algorithms | 305.900410 | 314.389914 | 312.637840
scimark2: Composite | 2759.35 | 2673.79 | 2779.62
compress-lz4: 1 - Compression Speed | 9459.69 | 9838.54 | 9780.39
compress-lz4: 3 - Compression Speed | 45.57 | 48.40 | 48.52
compress-lz4: 9 - Compression Speed | 44.72 | 44.30 | 45.76
hint: FLOAT | 291925951.64210 | 292874904.53733 | 294314450.61706
basis: UASTC Level 2 | 16.507 | 16.485 | 16.315
basis: UASTC Level 3 | 25.522 | 25.321 | 25.210
basis: UASTC Level 2 + RDO Post-Processing | 755.517 | 837.541 | 833.536
encode-mp3: WAV To MP3 | 8.798 | 10.078 | 11.026
tjbench: Decompression Throughput | 171.888797 | 174.991039 | 176.873794
astcenc: Medium | 7.00 | 6.05 | 5.81
astcenc: Thorough | 9.61 | 8.32 | 8.16
astcenc: Exhaustive | 73.46 | 67.17 | 66.04
libraw: Post-Processing Benchmark | 52.54 | 36.98 | 38.47
webp: Quality 100, Lossless | 20.799 | 20.719 | 20.904
webp: Quality 100, Highest Compression | 8.861 | 7.655 | 7.546
webp: Quality 100, Lossless, Highest Compression | 44.383 | 42.377 | 42.704
daphne: OpenMP - NDT Mapping | 874.54 | 945.32 | 923.68
daphne: OpenMP - Points2Image | 18452.623694826 | 11946.566719313 | 13720.155459109
daphne: OpenMP - Euclidean Cluster | 890.71 | 674.01 | 678.51
mrbayes: Primate Phylogeny Analysis | 94.284 | 93.841 | 90.759
rnnoise | 21.723 | 21.540 | 20.756
tnn: CPU - MobileNet v2 | 324.172 | 392.204 | 365.390
tnn: CPU - SqueezeNet v1.1 | 305.134 | 311.682 | 304.178
ncnn: CPU - squeezenet | 17.29 | 15.35 | 14.38
ncnn: CPU - mobilenet | 19.46 | 17.85 | 17.02
ncnn: CPU-v2-v2 - mobilenet-v2 | 9.37 | 6.88 | 5.62
ncnn: CPU-v3-v3 - mobilenet-v3 | 8.88 | 6.87 | 5.20
ncnn: CPU - shufflenet-v2 | 10.59 | 7.35 | 6.25
ncnn: CPU - mnasnet | 8.95 | 6.26 | 5.01
ncnn: CPU - efficientnet-b0 | 11.32 | 8.77 | 6.94
ncnn: CPU - blazeface | 3.99 | 2.84 | 2.19
ncnn: CPU - googlenet | 20.85 | 18.88 | 16.51
ncnn: CPU - vgg16 | 30.89 | 36.60 | 32.03
ncnn: CPU - resnet18 | 13.12 | 13.19 | 11.99
ncnn: CPU - alexnet | 9.46 | 10.83 | 8.94
ncnn: CPU - resnet50 | 23.70 | 22.23 | 19.70
ncnn: CPU - yolov4-tiny | 29.47 | 30.70 | 29.38
onednn: IP Batch 1D - f32 - CPU | 1.67503 | 1.72247 | 1.40879
onednn: IP Batch 1D - u8s8f32 - CPU | 1.17556 | 1.03137 | 1.00926
onednn: IP Batch All - u8s8f32 - CPU | 13.8793 | 13.4649 | 11.9233
onednn: Deconvolution Batch deconv_1d - f32 - CPU | 1.95695 | 1.69390 | 1.66772
onednn: Deconvolution Batch deconv_3d - f32 - CPU | 3.68073 | 3.72354 | 3.66096
onednn: Deconvolution Batch deconv_1d - u8s8f32 - CPU | 2.17957 | 2.05539 | 1.99141
onednn: Deconvolution Batch deconv_3d - u8s8f32 - CPU | 1.94380 | 1.98927 | 1.87485
onednn: Recurrent Neural Network Training - f32 - CPU | 254.610 | 172.513 | 147.523
onednn: Recurrent Neural Network Inference - f32 - CPU | 79.2495 | 64.9912 | 30.7182
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.532944 | 0.575864 | 0.464235
stockfish: Total Time | 58347386 | 62434784 | 62375605
compress-zstd: 3 | 7849.6 | 7866.2 | 7937.8
compress-zstd: 19 | 109.9 | 111.2 | 114.6
aobench: 2048 x 2048 - Total Time | 35.783 | 41.655 | 37.758
vpxenc: Speed 0 | 6.27 | 5.96 | 6.65
vpxenc: Speed 5 | 19.03 | 18.23 | 20.12
graphics-magick: Swirl | 1295 | 1323 | 1368
graphics-magick: Rotate | 535 | 527 | 525
graphics-magick: Sharpen | 434 | 384 | 316
graphics-magick: Enhanced | 590 | 627 | 526
graphics-magick: Resizing | 2092 | 1827 | 1653
svt-vp9: VMAF Optimized - Bosphorus 1080p | 354.98 | 363.72 | 366.61
svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p | 348.24 | 365.57 | 368.17
svt-vp9: Visual Quality Optimized - Bosphorus 1080p | 279.73 | 286.24 | 291.22
x264: H.264 Video Encoding | 149.35 | 146.43 | 151.92
dav1d: Chimera 1080p | 564.96 | 572.19 | 575.22
dav1d: Summer Nature 4K | 269.72 | 275.41 | 274.20
dav1d: Summer Nature 1080p | 567.42 | 584.65 | 588.62
dav1d: Chimera 1080p 10-bit | 143.02 | 92.56 | 122.39
svt-av1: Enc Mode 0 - 1080p | 0.103 | 0.145 | 0.146
svt-av1: Enc Mode 4 - 1080p | 6.775 | 8.591 | 8.657
svt-av1: Enc Mode 8 - 1080p | 55.713 | 70.229 | 70.698
x265: Bosphorus 4K | 22.83 | 22.44 | 23.69
x265: Bosphorus 1080p | 49.05 | 50.02 | 50.60
c-ray: Total Time - 4K, 16 Rays Per Pixel | 18.922 | 30.646 | 33.275
nginx: Static Web Page Serving | 30658.67 | 30676.68 | 31368.86
openssl: RSA 4096-bit Performance | 7395.4 | 5412.5 | 5413.5
redis: LPUSH | 1212068.06 | 1304842.23 | 1380024.29
redis: GET | 1809976.83 | 2122749.43 | 1874175.91
redis: SET | 1369757.13 | 1483322.31 | 1446872.35
sqlite-speedtest: Timed Time - Size 1,000 | 75.140 | 77.642 | 78.067
pgbench: 1 - 1 - Read Only | 28919 | 28921 | 29645
pgbench: 1 - 1 - Read Only - Average Latency | 0.035 | 0.035 | 0.034
pgbench: 1 - 1 - Read Write | 3739 | 3772 | 3865
pgbench: 1 - 1 - Read Write - Average Latency | 0.268 | 0.265 | 0.259
pgbench: 1 - 50 - Read Only | 521332 | 530684 | 541314
pgbench: 1 - 50 - Read Only - Average Latency | 0.096 | 0.094 | 0.092
pgbench: 1 - 50 - Read Write | 3453 | 3443 | 3413
pgbench: 1 - 50 - Read Write - Average Latency | 14.488 | 14.528 | 14.655
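A results overview like this is often condensed into a single figure per compiler by taking the geometric mean of per-test ratios against a baseline. The sketch below is only an illustration of that arithmetic: the four-test sample is hand-picked from the per-test results in this export, GCC 10.2 is assumed as the baseline, and the names `geometric_mean`, `normalized_ratios`, and `SAMPLE` are hypothetical. Its output is not a summary of the full comparison.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hand-picked sample: (test, GCC 10.2 result, AMD AOCC 2.3 result, higher_is_better)
SAMPLE = [
    ("TSCP (nodes/s)", 1007283, 1138442, True),
    ("SciMark composite (Mflops)", 2759.35, 2779.62, True),
    ("LAME WAV to MP3 (s)", 8.798, 11.026, False),
    ("ASTC Encoder medium (s)", 7.00, 5.81, False),
]


def normalized_ratios(sample):
    """Return AOCC-vs-GCC ratios where a value above 1.0 always means AOCC was faster."""
    ratios = []
    for _name, gcc, aocc, higher_is_better in sample:
        # Invert time-based results so every ratio points the same way.
        ratios.append(aocc / gcc if higher_is_better else gcc / aocc)
    return ratios


if __name__ == "__main__":
    ratios = normalized_ratios(SAMPLE)
    print(f"AOCC vs GCC, geometric mean over {len(ratios)} sample tests: "
          f"{geometric_mean(ratios):.3f}")
```

The geometric mean is used here instead of the arithmetic mean so that a 2x win and a 2x loss cancel out, which is why it is the usual choice for averaging ratios.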

TSCP

AI Chess Performance

TSCP 1.81 (Nodes Per Second, More Is Better)
LLVM Clang 11: 1143642 (SE +/- 582.00, N = 5)
GCC 10.2: 1007283 (SE +/- 1467.20, N = 5)
AMD AOCC 2.3: 1138442 (SE +/- 471.20, N = 5)
1. (CC) gcc options: -O3 -march=znver2 -march=native
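To read a result such as the TSCP figures above as a percentage difference, each value can be divided by a chosen baseline. The following is a minimal sketch, assuming GCC 10.2 as the baseline; the function name `relative_to_baseline` and its `higher_is_better` parameter are hypothetical and not part of any benchmarking tool.

```python
def relative_to_baseline(results, baseline, higher_is_better=True):
    """Return the percentage advantage of each entry over the baseline entry.

    For "More Is Better" metrics the ratio is value / baseline; for
    "Fewer Is Better" metrics it is baseline / value, so a positive
    percentage always means "faster than the baseline".
    """
    base = results[baseline]
    percentages = {}
    for name, value in results.items():
        ratio = value / base if higher_is_better else base / value
        percentages[name] = (ratio - 1.0) * 100.0
    return percentages


# TSCP 1.81 results from this export (Nodes Per Second, More Is Better).
tscp = {
    "LLVM Clang 11": 1143642,
    "GCC 10.2": 1007283,
    "AMD AOCC 2.3": 1138442,
}

if __name__ == "__main__":
    for name, pct in relative_to_baseline(tscp, "GCC 10.2").items():
        print(f"{name}: {pct:+.1f}% vs GCC 10.2")
```

For time-based tests (Seconds or ms, Fewer Is Better), pass `higher_is_better=False` so that the sign keeps the same meaning.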

Crypto++

Test: Unkeyed Algorithms

Crypto++ 8.2 (MiB/second, More Is Better)
LLVM Clang 11: 314.39 (SE +/- 0.20, N = 3)
GCC 10.2: 305.90 (SE +/- 0.09, N = 3)
AMD AOCC 2.3: 312.64 (SE +/- 0.16, N = 3)
1. (CXX) g++ options: -O3 -march=znver2 -fPIC -pthread -pipe

SciMark

Computational Test: Composite

SciMark 2.0 (Mflops, More Is Better)
LLVM Clang 11: 2673.79 (SE +/- 7.04, N = 3)
GCC 10.2: 2759.35 (SE +/- 14.97, N = 3)
AMD AOCC 2.3: 2779.62 (SE +/- 43.99, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -lm

LZ4 Compression

Compression Level: 1 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
LLVM Clang 11: 9838.54 (SE +/- 55.02, N = 3)
GCC 10.2: 9459.69 (SE +/- 56.41, N = 3)
AMD AOCC 2.3: 9780.39 (SE +/- 45.16, N = 3)
1. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 3 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
LLVM Clang 11: 48.40 (SE +/- 0.04, N = 3)
GCC 10.2: 45.57 (SE +/- 0.66, N = 3)
AMD AOCC 2.3: 48.52 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 9 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
LLVM Clang 11: 44.30 (SE +/- 0.02, N = 3)
GCC 10.2: 44.72 (SE +/- 0.60, N = 3)
AMD AOCC 2.3: 45.76 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O3

Hierarchical INTegration

Test: FLOAT

Hierarchical INTegration 1.0 (QUIPs, More Is Better)
LLVM Clang 11: 292874904.54 (SE +/- 170353.62, N = 3)
GCC 10.2: 291925951.64 (SE +/- 15707.07, N = 3)
AMD AOCC 2.3: 294314450.62 (SE +/- 30193.74, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -march=native -lm

Basis Universal

Settings: UASTC Level 2

Basis Universal 1.12 (Seconds, Fewer Is Better)
LLVM Clang 11: 16.49 (SE +/- 0.01, N = 3)
GCC 10.2: 16.51 (SE +/- 0.02, N = 3)
AMD AOCC 2.3: 16.32 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.12 (Seconds, Fewer Is Better)
LLVM Clang 11: 25.32 (SE +/- 0.01, N = 3)
GCC 10.2: 25.52 (SE +/- 0.01, N = 3)
AMD AOCC 2.3: 25.21 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 2 + RDO Post-Processing

Basis Universal 1.12 (Seconds, Fewer Is Better)
LLVM Clang 11: 837.54 (SE +/- 0.24, N = 3)
GCC 10.2: 755.52 (SE +/- 0.14, N = 3)
AMD AOCC 2.3: 833.54 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100 (Seconds, Fewer Is Better)
LLVM Clang 11: 10.078 (SE +/- 0.003, N = 3)
GCC 10.2: 8.798 (SE +/- 0.004, N = 3)
AMD AOCC 2.3: 11.026 (SE +/- 0.004, N = 3)
Additional options reported for one of the three builds (the export does not preserve which): -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr
1. (CC) gcc options: -O3 -pipe -march=znver2 -lncurses -lm

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 2.0.2 (Megapixels/sec, More Is Better)
LLVM Clang 11: 174.99 (SE +/- 0.21, N = 3)
GCC 10.2: 171.89 (SE +/- 0.04, N = 3)
AMD AOCC 2.3: 176.87 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -rdynamic

ASTC Encoder

Preset: Medium

ASTC Encoder 2.0 (Seconds, Fewer Is Better)
LLVM Clang 11: 6.05 (SE +/- 0.01, N = 3)
GCC 10.2: 7.00 (SE +/- 0.01, N = 3)
AMD AOCC 2.3: 5.81 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

ASTC Encoder

Preset: Thorough

ASTC Encoder 2.0 (Seconds, Fewer Is Better)
LLVM Clang 11: 8.32 (SE +/- 0.00, N = 3)
GCC 10.2: 9.61 (SE +/- 0.00, N = 3)
AMD AOCC 2.3: 8.16 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 2.0 (Seconds, Fewer Is Better)
LLVM Clang 11: 67.17 (SE +/- 0.01, N = 3)
GCC 10.2: 73.46 (SE +/- 0.02, N = 3)
AMD AOCC 2.3: 66.04 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

LibRaw

Post-Processing Benchmark

LibRaw 0.20 (Mpix/sec, More Is Better)
LLVM Clang 11: 36.98 (SE +/- 0.13, N = 3)
GCC 10.2: 52.54 (SE +/- 0.16, N = 3)
AMD AOCC 2.3: 38.47 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -O3 -march=znver2 -fopenmp -ljpeg -lz -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
LLVM Clang 11: 20.72 (SE +/- 0.02, N = 3)
GCC 10.2: 20.80 (SE +/- 0.08, N = 3)
AMD AOCC 2.3: 20.90 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=znver2 -pthread -lm -ljpeg
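When results sit as close together as the WebP lossless figures above, the reported standard errors give a rough sense of whether a gap is larger than run-to-run noise. The sketch below expresses the difference between two means in units of their combined standard error. The function name `difference_in_se_units` and the threshold of 2 are assumptions chosen for illustration; with only N = 3 runs per result this is a coarse indicator, not a formal significance test.

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0.0:
        # Both standard errors were reported as zero; any gap is "infinitely" clear.
        return 0.0 if mean_a == mean_b else math.inf
    return abs(mean_a - mean_b) / combined


# WebP 1.1, Quality 100, Lossless (seconds): (mean, standard error) from this export.
webp_lossless = {
    "LLVM Clang 11": (20.72, 0.02),
    "GCC 10.2": (20.80, 0.08),
    "AMD AOCC 2.3": (20.90, 0.03),
}

THRESHOLD = 2.0  # assumed rule of thumb, not taken from the source

if __name__ == "__main__":
    clang = webp_lossless["LLVM Clang 11"]
    for other in ("GCC 10.2", "AMD AOCC 2.3"):
        score = difference_in_se_units(*clang, *webp_lossless[other])
        verdict = "likely real" if score > THRESHOLD else "within noise"
        print(f"LLVM Clang 11 vs {other}: {score:.2f} SE units ({verdict})")
```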

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
LLVM Clang 11: 7.655 (SE +/- 0.008, N = 3)
GCC 10.2: 8.861 (SE +/- 0.003, N = 3)
AMD AOCC 2.3: 7.546 (SE +/- 0.008, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=znver2 -pthread -lm -ljpeg

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
LLVM Clang 11: 42.38 (SE +/- 0.12, N = 3)
GCC 10.2: 44.38 (SE +/- 0.12, N = 3)
AMD AOCC 2.3: 42.70 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=znver2 -pthread -lm -ljpeg

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: NDT Mapping

Darmstadt Automotive Parallel Heterogeneous Suite (Test Cases Per Minute, More Is Better)
LLVM Clang 11: 945.32 (SE +/- 2.29, N = 3)
GCC 10.2: 874.54 (SE +/- 2.97, N = 3)
AMD AOCC 2.3: 923.68 (SE +/- 1.07, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: Points2Image

Darmstadt Automotive Parallel Heterogeneous Suite (Test Cases Per Minute, More Is Better)
LLVM Clang 11: 11946.57 (SE +/- 96.60, N = 3)
GCC 10.2: 18452.62 (SE +/- 140.90, N = 3)
AMD AOCC 2.3: 13720.16 (SE +/- 163.63, N = 6)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: Euclidean Cluster

Darmstadt Automotive Parallel Heterogeneous Suite (Test Cases Per Minute, More Is Better)
LLVM Clang 11: 674.01 (SE +/- 2.53, N = 3)
GCC 10.2: 890.71 (SE +/- 0.77, N = 3)
AMD AOCC 2.3: 678.51 (SE +/- 0.47, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp

Timed MrBayes Analysis

Primate Phylogeny Analysis

Timed MrBayes Analysis 3.2.7 (Seconds, Fewer Is Better)
LLVM Clang 11: 93.84 (SE +/- 0.04, N = 3)
GCC 10.2: 94.28 (SE +/- 0.17, N = 3)
AMD AOCC 2.3: 90.76 (SE +/- 0.04, N = 3)
Additional option reported for one of the three builds (the export does not preserve which): -mabm
1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -O3 -std=c99 -pedantic -march=znver2 -lm

RNNoise

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
LLVM Clang 11: 21.54 (SE +/- 0.00, N = 3)
GCC 10.2: 21.72 (SE +/- 0.02, N = 3)
AMD AOCC 2.3: 20.76 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -pedantic -fvisibility=hidden

TNN

Target: CPU - Model: MobileNet v2

TNN 0.2.3 (ms, Fewer Is Better)
LLVM Clang 11: 392.20 (SE +/- 0.45, N = 3; MIN: 390.71 / MAX: 393.87; -fopenmp=libomp)
GCC 10.2: 324.17 (SE +/- 0.62, N = 3; MIN: 311.9 / MAX: 354.48; -fopenmp)
AMD AOCC 2.3: 365.39 (SE +/- 0.24, N = 3; MIN: 364.57 / MAX: 366.34; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=znver2 -pthread -fvisibility=hidden -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.2.3 (ms, Fewer Is Better)
LLVM Clang 11: 311.68 (SE +/- 1.92, N = 3; MIN: 306.99 / MAX: 314.32; -fopenmp=libomp)
GCC 10.2: 305.13 (SE +/- 0.20, N = 3; MIN: 304.36 / MAX: 306.06; -fopenmp)
AMD AOCC 2.3: 304.18 (SE +/- 0.66, N = 3; MIN: 302.78 / MAX: 315.91; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=znver2 -pthread -fvisibility=hidden -rdynamic -ldl

NCNN

Target: CPU - Model: squeezenet

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 15.35 (SE +/- 0.14, N = 15; MIN: 14.29 / MAX: 18.88; -lomp)
GCC 10.2: 17.29 (SE +/- 0.12, N = 3; MIN: 16.89 / MAX: 19.32; -lgomp)
AMD AOCC 2.3: 14.38 (SE +/- 0.10, N = 3; MIN: 13.96 / MAX: 16.87; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: mobilenet

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 17.85 (SE +/- 0.13, N = 15; MIN: 16.89 / MAX: 21.04; -lomp)
GCC 10.2: 19.46 (SE +/- 0.12, N = 3; MIN: 18.87 / MAX: 31.76; -lgomp)
AMD AOCC 2.3: 17.02 (SE +/- 0.34, N = 3; MIN: 16.39 / MAX: 20.11; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 6.88 (SE +/- 0.06, N = 15; MIN: 6.35 / MAX: 16.49; -lomp)
GCC 10.2: 9.37 (SE +/- 0.24, N = 3; MIN: 8.62 / MAX: 11; -lgomp)
AMD AOCC 2.3: 5.62 (SE +/- 0.08, N = 3; MIN: 5.35 / MAX: 7.49; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 6.87 (SE +/- 0.06, N = 15; MIN: 6.16 / MAX: 8.64; -lomp)
GCC 10.2: 8.88 (SE +/- 0.11, N = 3; MIN: 8.58 / MAX: 10.83; -lgomp)
AMD AOCC 2.3: 5.20 (SE +/- 0.04, N = 3; MIN: 5.05 / MAX: 7.61; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 7.35 (SE +/- 0.02, N = 15; MIN: 7.09 / MAX: 11.01; -lomp)
GCC 10.2: 10.59 (SE +/- 0.68, N = 3; MIN: 9.42 / MAX: 13.72; -lgomp)
AMD AOCC 2.3: 6.25 (SE +/- 0.12, N = 3; MIN: 5.96 / MAX: 6.53; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 6.26 (SE +/- 0.09, N = 15; MIN: 5.63 / MAX: 8.5; -lomp)
GCC 10.2: 8.95 (SE +/- 0.29, N = 3; MIN: 8.26 / MAX: 11.03; -lgomp)
AMD AOCC 2.3: 5.01 (SE +/- 0.03, N = 3; MIN: 4.87 / MAX: 5.39; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 8.77 (SE +/- 0.09, N = 15; MIN: 7.99 / MAX: 13.6; -lomp)
GCC 10.2: 11.32 (SE +/- 0.09, N = 3; MIN: 11.03 / MAX: 13.16; -lgomp)
AMD AOCC 2.3: 6.94 (SE +/- 0.05, N = 3; MIN: 6.72 / MAX: 9; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: blazeface

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 2.84 (SE +/- 0.02, N = 15; MIN: 2.67 / MAX: 4.67; -lomp)
GCC 10.2: 3.99 (SE +/- 0.10, N = 3; MIN: 3.77 / MAX: 4.73; -lgomp)
AMD AOCC 2.3: 2.19 (SE +/- 0.03, N = 3; MIN: 2.1 / MAX: 2.43; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: googlenet

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 18.88 (SE +/- 0.27, N = 15; MIN: 16.91 / MAX: 23.31; -lomp)
GCC 10.2: 20.85 (SE +/- 0.10, N = 3; MIN: 19.8 / MAX: 22.84; -lgomp)
AMD AOCC 2.3: 16.51 (SE +/- 0.05, N = 3; MIN: 16.26 / MAX: 18.75; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: vgg16

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 36.60 (SE +/- 0.42, N = 15; MIN: 32.93 / MAX: 48.08; -lomp)
GCC 10.2: 30.89 (SE +/- 0.30, N = 3; MIN: 30.04 / MAX: 50.21; -lgomp)
AMD AOCC 2.3: 32.03 (SE +/- 0.31, N = 3; MIN: 30.86 / MAX: 35.2; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: resnet18

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 13.19 (SE +/- 0.12, N = 15; MIN: 12.17 / MAX: 23.53; -lomp)
GCC 10.2: 13.12 (SE +/- 0.09, N = 3; MIN: 12.79 / MAX: 15.11; -lgomp)
AMD AOCC 2.3: 11.99 (SE +/- 0.15, N = 3; MIN: 11.71 / MAX: 14.27; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: alexnet

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 10.83 (SE +/- 0.18, N = 15; MIN: 9.15 / MAX: 60.4; -lomp)
GCC 10.2: 9.46 (SE +/- 0.14, N = 3; MIN: 9.17 / MAX: 11.49; -lgomp)
AMD AOCC 2.3: 8.94 (SE +/- 0.01, N = 3; MIN: 8.81 / MAX: 13.49; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: resnet50

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 22.23 (SE +/- 0.20, N = 15; MIN: 20.63 / MAX: 31.79; -lomp)
GCC 10.2: 23.70 (SE +/- 0.16, N = 3; MIN: 23.21 / MAX: 25.68; -lgomp)
AMD AOCC 2.3: 19.70 (SE +/- 0.18, N = 3; MIN: 19.16 / MAX: 22.41; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20200916 (ms, Fewer Is Better)
LLVM Clang 11: 30.70 (SE +/- 0.19, N = 15; MIN: 29.08 / MAX: 40.6; -lomp)
GCC 10.2: 29.47 (SE +/- 0.12, N = 3; MIN: 28.89 / MAX: 31.49; -lgomp)
AMD AOCC 2.3: 29.38 (SE +/- 0.15, N = 3; MIN: 28.77 / MAX: 31.68; -lomp)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread

oneDNN

Harness: IP Batch 1D - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
LLVM Clang 11: 1.72247 (SE +/- 0.00256, N = 3; MIN: 1.62; -fopenmp=libomp)
GCC 10.2: 1.67503 (SE +/- 0.00519, N = 3; MIN: 1.59; -fopenmp)
AMD AOCC 2.3: 1.40879 (SE +/- 0.00432, N = 3; MIN: 1.36; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Batch 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
LLVM Clang 11: 1.03137 (SE +/- 0.00137, N = 3; MIN: 0.97; -fopenmp=libomp)
GCC 10.2: 1.17556 (SE +/- 0.00350, N = 3; MIN: 1.13; -fopenmp)
AMD AOCC 2.3: 1.00926 (SE +/- 0.00275, N = 3; MIN: 0.95; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Batch All - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
LLVM Clang 11: 13.46 (SE +/- 0.04, N = 3; MIN: 13.14; -fopenmp=libomp)
GCC 10.2: 13.88 (SE +/- 0.06, N = 3; MIN: 13.35; -fopenmp)
AMD AOCC 2.3: 11.92 (SE +/- 0.02, N = 3; MIN: 11.65; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
LLVM Clang 11: 1.69390 (SE +/- 0.01023, N = 3; MIN: 1.63; -fopenmp=libomp)
GCC 10.2: 1.95695 (SE +/- 0.01336, N = 3; MIN: 1.87; -fopenmp)
AMD AOCC 2.3: 1.66772 (SE +/- 0.00186, N = 3; MIN: 1.61; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
LLVM Clang 11: 3.72354 (SE +/- 0.00775, N = 3; MIN: 3.57; -fopenmp=libomp)
GCC 10.2: 3.68073 (SE +/- 0.01624, N = 3; MIN: 3.53; -fopenmp)
AMD AOCC 2.3: 3.66096 (SE +/- 0.01101, N = 3; MIN: 3.49; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
LLVM Clang 11: 2.05539 (SE +/- 0.00233, N = 3; MIN: 1.96; -fopenmp=libomp)
GCC 10.2: 2.17957 (SE +/- 0.00145, N = 3; MIN: 2.05; -fopenmp)
AMD AOCC 2.3: 1.99141 (SE +/- 0.00112, N = 3; MIN: 1.92; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
LLVM Clang 11: 1.98927 (SE +/- 0.00691, N = 3; MIN: 1.9; -fopenmp=libomp)
GCC 10.2: 1.94380 (SE +/- 0.00392, N = 3; MIN: 1.82; -fopenmp)
AMD AOCC 2.3: 1.87485 (SE +/- 0.00189, N = 3; MIN: 1.82; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
LLVM Clang 11: 172.51 (SE +/- 1.10, N = 3; MIN: 169.54; -fopenmp=libomp)
GCC 10.2: 254.61 (SE +/- 0.73, N = 3; MIN: 251.67; -fopenmp)
AMD AOCC 2.3: 147.52 (SE +/- 0.06, N = 3; MIN: 146.31; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
LLVM Clang 11: 64.99 (SE +/- 2.37, N = 15; MIN: 50.95; -fopenmp=libomp)
GCC 10.2: 79.25 (SE +/- 0.80, N = 3; MIN: 77.35; -fopenmp)
AMD AOCC 2.3: 30.72 (SE +/- 0.32, N = 3; MIN: 29.35; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 1.5 (ms, Fewer Is Better)
LLVM Clang 11: 0.575864 (SE +/- 0.001767, N = 3; MIN: 0.55; -fopenmp=libomp)
GCC 10.2: 0.532944 (SE +/- 0.001691, N = 3; MIN: 0.51; -fopenmp)
AMD AOCC 2.3: 0.464235 (SE +/- 0.000544, N = 3; MIN: 0.45; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

Stockfish

Total Time

Stockfish 12 (Nodes Per Second, More Is Better)
LLVM Clang 11: 62434784 (SE +/- 877956.02, N = 4; -flto=thin)
GCC 10.2: 58347386 (SE +/- 872585.44, N = 3; -flto -flto=jobserver)
AMD AOCC 2.3: 62375605 (SE +/- 379890.17, N = 3; -flto=thin)
1. (CXX) g++ options: -m64 -lpthread -O3 -march=znver2 -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2

Zstd Compression

Compression Level: 3

Zstd Compression 1.4.5 (MB/s, More Is Better)
LLVM Clang 11: 7866.2 (SE +/- 30.93, N = 3)
GCC 10.2: 7849.6 (SE +/- 6.11, N = 3)
AMD AOCC 2.3: 7937.8 (SE +/- 3.73, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -pthread -lz

Zstd Compression

Compression Level: 19

Zstd Compression 1.4.5 (MB/s, More Is Better)
LLVM Clang 11: 111.2 (SE +/- 0.30, N = 3)
GCC 10.2: 109.9 (SE +/- 0.23, N = 3)
AMD AOCC 2.3: 114.6 (SE +/- 0.26, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -pthread -lz

AOBench

Size: 2048 x 2048 - Total Time

AOBench (Seconds, Fewer Is Better)
LLVM Clang 11: 41.66 (SE +/- 0.02, N = 3)
GCC 10.2: 35.78 (SE +/- 0.02, N = 3)
AMD AOCC 2.3: 37.76 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -lm -O3 -march=znver2

VP9 libvpx Encoding

Speed: Speed 0

VP9 libvpx Encoding 1.8.2 (Frames Per Second, More Is Better)
LLVM Clang 11: 5.96 (SE +/- 0.09, N = 3)
GCC 10.2: 6.27 (SE +/- 0.02, N = 3)
AMD AOCC 2.3: 6.65 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -march=znver2 -fPIC -U_FORTIFY_SOURCE -std=c++11

VP9 libvpx Encoding

Speed: Speed 5

VP9 libvpx Encoding 1.8.2 (Frames Per Second, More Is Better)
LLVM Clang 11: 18.23 (SE +/- 0.07, N = 3)
GCC 10.2: 19.03 (SE +/- 0.04, N = 3)
AMD AOCC 2.3: 20.12 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -march=znver2 -fPIC -U_FORTIFY_SOURCE -std=c++11

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
LLVM Clang 11: 1323 (SE +/- 2.40, N = 3)
GCC 10.2: 1295 (SE +/- 1.00, N = 3)
AMD AOCC 2.3: 1368 (SE +/- 4.26, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
LLVM Clang 11: 527
GCC 10.2: 535
AMD AOCC 2.3: 525
Standard errors reported for two of the three results (the export does not preserve which): SE +/- 5.24, N = 3; SE +/- 1.00, N = 3
1. (CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
LLVM Clang 11: 384
GCC 10.2: 434
AMD AOCC 2.3: 316
1. (CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
LLVM Clang 11: 627
GCC 10.2: 590
AMD AOCC 2.3: 526
1. (CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
LLVM Clang 11: 1827 (SE +/- 10.17, N = 3)
GCC 10.2: 2092 (SE +/- 17.89, N = 3)
AMD AOCC 2.3: 1653 (SE +/- 22.27, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljpeg -lX11 -lz -lm -lpthread

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 (Frames Per Second, More Is Better)
LLVM Clang 11: 363.72 (SE +/- 1.70, N = 3)
GCC 10.2: 354.98 (SE +/- 2.15, N = 3)
AMD AOCC 2.3: 366.61 (SE +/- 1.39, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 (Frames Per Second, More Is Better)
LLVM Clang 11: 365.57 (SE +/- 1.74, N = 3)
GCC 10.2: 348.24 (SE +/- 1.42, N = 3)
AMD AOCC 2.3: 368.17 (SE +/- 0.54, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p

SVT-VP9 0.1 (Frames Per Second, More Is Better)
LLVM Clang 11: 286.24 (SE +/- 1.68, N = 3)
GCC 10.2: 279.73 (SE +/- 1.18, N = 3)
AMD AOCC 2.3: 291.22 (SE +/- 0.90, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

x264

H.264 Video Encoding

x264 2019-12-17 (Frames Per Second, More Is Better)
LLVM Clang 11: 146.43 (SE +/- 0.59, N = 3)
GCC 10.2: 149.35 (SE +/- 1.24, N = 3)
AMD AOCC 2.3: 151.92 (SE +/- 0.67, N = 3)
Additional option reported for two of the three builds (the export does not preserve which): -mstack-alignment=64
1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=znver2 -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize

dav1d

Video Input: Chimera 1080p

dav1d 0.7.0 (FPS, More Is Better)
LLVM Clang 11: 572.19 (SE +/- 0.97, N = 3; MIN: 404.8 / MAX: 726.7)
GCC 10.2: 564.96 (SE +/- 1.05, N = 3; MIN: 399.64 / MAX: 724.27)
AMD AOCC 2.3: 575.22 (SE +/- 0.49, N = 3; MIN: 414.12 / MAX: 729.95)
1. (CC) gcc options: -O3 -march=znver2 -pthread

dav1d

Video Input: Summer Nature 4K

dav1d 0.7.0 (FPS, More Is Better)
LLVM Clang 11: 275.41 (SE +/- 1.08, N = 3; MIN: 155.99 / MAX: 295.35)
GCC 10.2: 269.72 (SE +/- 0.57, N = 3; MIN: 160.12 / MAX: 288.9)
AMD AOCC 2.3: 274.20 (SE +/- 1.22, N = 3; MIN: 151.98 / MAX: 293.44)
1. (CC) gcc options: -O3 -march=znver2 -pthread

dav1d

Video Input: Summer Nature 1080p

dav1d 0.7.0 - FPS, More Is Better
LLVM Clang 11: 584.65 (SE +/- 0.95, N = 3; MIN: 337.56 / MAX: 641.35)
GCC 10.2: 567.42 (SE +/- 0.36, N = 3; MIN: 337.19 / MAX: 625.71)
AMD AOCC 2.3: 588.62 (SE +/- 2.09, N = 3; MIN: 345.64 / MAX: 651.08)
1. (CC) gcc options: -O3 -march=znver2 -pthread

dav1d

Video Input: Chimera 1080p 10-bit

dav1d 0.7.0 - FPS, More Is Better
LLVM Clang 11: 92.56 (SE +/- 0.26, N = 3; MIN: 61.05 / MAX: 158.66)
GCC 10.2: 143.02 (SE +/- 0.18, N = 3; MIN: 98.76 / MAX: 246.17)
AMD AOCC 2.3: 122.39 (SE +/- 0.34, N = 3; MIN: 85.78 / MAX: 202.39)
1. (CC) gcc options: -O3 -march=znver2 -pthread

SVT-AV1

Encoder Mode: Enc Mode 0 - Input: 1080p

SVT-AV1 0.8 - Frames Per Second, More Is Better
LLVM Clang 11: 0.145 (SE +/- 0.000, N = 3)
GCC 10.2: 0.103 (SE +/- 0.000, N = 3)
AMD AOCC 2.3: 0.146 (SE +/- 0.000, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-AV1

Encoder Mode: Enc Mode 4 - Input: 1080p

SVT-AV1 0.8 - Frames Per Second, More Is Better
LLVM Clang 11: 8.591 (SE +/- 0.026, N = 3)
GCC 10.2: 6.775 (SE +/- 0.025, N = 3)
AMD AOCC 2.3: 8.657 (SE +/- 0.042, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-AV1

Encoder Mode: Enc Mode 8 - Input: 1080p

SVT-AV1 0.8 - Frames Per Second, More Is Better
LLVM Clang 11: 70.23 (SE +/- 0.30, N = 3)
GCC 10.2: 55.71 (SE +/- 0.29, N = 3)
AMD AOCC 2.3: 70.70 (SE +/- 0.46, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

x265

Video Input: Bosphorus 4K

x265 3.4 - Frames Per Second, More Is Better
LLVM Clang 11: 22.44 (SE +/- 0.03, N = 3)
GCC 10.2: 22.83 (SE +/- 0.06, N = 3)
AMD AOCC 2.3: 23.69 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread -lrt -ldl -lnuma

x265

Video Input: Bosphorus 1080p

x265 3.4 - Frames Per Second, More Is Better
LLVM Clang 11: 50.02 (SE +/- 0.09, N = 3)
GCC 10.2: 49.05 (SE +/- 0.13, N = 3)
AMD AOCC 2.3: 50.60 (SE +/- 0.14, N = 3)
1. (CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread -lrt -ldl -lnuma

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1 - Seconds, Fewer Is Better
LLVM Clang 11: 30.65 (SE +/- 0.08, N = 3)
GCC 10.2: 18.92 (SE +/- 0.02, N = 3)
AMD AOCC 2.3: 33.28 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -march=znver2

NGINX Benchmark

Static Web Page Serving

NGINX Benchmark 1.9.9 - Requests Per Second, More Is Better
LLVM Clang 11: 30676.68 (SE +/- 159.74, N = 3)
GCC 10.2: 30658.67 (SE +/- 381.73, N = 4)
AMD AOCC 2.3: 31368.86 (SE +/- 254.59, N = 15)
1. (CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native -march=znver2

OpenSSL

RSA 4096-bit Performance

OpenSSL 1.1.1 - Signs Per Second, More Is Better
LLVM Clang 11: 5412.5 (SE +/- 1.37, N = 3)
GCC 10.2: 7395.4 (SE +/- 0.73, N = 3)
AMD AOCC 2.3: 5413.5 (SE +/- 0.64, N = 3)
Additional option reported for two of the three builds: -Qunused-arguments
1. (CC) gcc options: -pthread -m64 -O3 -march=znver2 -lssl -lcrypto -ldl

Redis

Test: LPUSH

Redis 6.0.9 - Requests Per Second, More Is Better
LLVM Clang 11: 1304842.23 (SE +/- 22719.73, N = 15)
GCC 10.2: 1212068.06 (SE +/- 21030.89, N = 15)
AMD AOCC 2.3: 1380024.29 (SE +/- 18763.42, N = 3)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3 -march=znver2

Redis

Test: GET

Redis 6.0.9 - Requests Per Second, More Is Better
LLVM Clang 11: 2122749.43 (SE +/- 49885.99, N = 15)
GCC 10.2: 1809976.83 (SE +/- 18838.90, N = 3)
AMD AOCC 2.3: 1874175.91 (SE +/- 30693.11, N = 15)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3 -march=znver2

Redis

Test: SET

Redis 6.0.9 - Requests Per Second, More Is Better
LLVM Clang 11: 1483322.31 (SE +/- 22282.05, N = 15)
GCC 10.2: 1369757.13 (SE +/- 24810.19, N = 15)
AMD AOCC 2.3: 1446872.35 (SE +/- 27170.26, N = 15)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3 -march=znver2

SQLite Speedtest

Timed Time - Size 1,000

SQLite Speedtest 3.30 - Seconds, Fewer Is Better
LLVM Clang 11: 77.64 (SE +/- 0.18, N = 3)
GCC 10.2: 75.14 (SE +/- 0.13, N = 3)
AMD AOCC 2.3: 78.07 (SE +/- 0.09, N = 3)
1. (CC) gcc options: -O3 -march=znver2 -ldl -lz -lpthread

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Only

PostgreSQL pgbench 13.0 - TPS, More Is Better
LLVM Clang 11: 28921 (SE +/- 252.07, N = 3)
GCC 10.2: 28919 (SE +/- 137.51, N = 3)
AMD AOCC 2.3: 29645 (SE +/- 73.04, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Only - Average Latency

PostgreSQL pgbench 13.0 - ms, Fewer Is Better
LLVM Clang 11: 0.035 (SE +/- 0.000, N = 3)
GCC 10.2: 0.035 (SE +/- 0.000, N = 3)
AMD AOCC 2.3: 0.034 (SE +/- 0.000, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Write

PostgreSQL pgbench 13.0 - TPS, More Is Better
LLVM Clang 11: 3772 (SE +/- 3.59, N = 3)
GCC 10.2: 3739 (SE +/- 46.08, N = 5)
AMD AOCC 2.3: 3865 (SE +/- 18.01, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 1 - Mode: Read Write - Average Latency

PostgreSQL pgbench 13.0 - ms, Fewer Is Better
LLVM Clang 11: 0.265 (SE +/- 0.000, N = 3)
GCC 10.2: 0.268 (SE +/- 0.003, N = 5)
AMD AOCC 2.3: 0.259 (SE +/- 0.001, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Only

PostgreSQL pgbench 13.0 - TPS, More Is Better
LLVM Clang 11: 530684 (SE +/- 4278.21, N = 3)
GCC 10.2: 521332 (SE +/- 4440.25, N = 3)
AMD AOCC 2.3: 541314 (SE +/- 798.75, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Only - Average Latency

PostgreSQL pgbench 13.0 - ms, Fewer Is Better
LLVM Clang 11: 0.094 (SE +/- 0.001, N = 3)
GCC 10.2: 0.096 (SE +/- 0.001, N = 3)
AMD AOCC 2.3: 0.092 (SE +/- 0.000, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Write

PostgreSQL pgbench 13.0 - TPS, More Is Better
LLVM Clang 11: 3443 (SE +/- 4.43, N = 3)
GCC 10.2: 3453 (SE +/- 5.86, N = 3)
AMD AOCC 2.3: 3413 (SE +/- 5.71, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

PostgreSQL pgbench

Scaling Factor: 1 - Clients: 50 - Mode: Read Write - Average Latency

PostgreSQL pgbench 13.0 - ms, Fewer Is Better
LLVM Clang 11: 14.53 (SE +/- 0.02, N = 3)
GCC 10.2: 14.49 (SE +/- 0.02, N = 3)
AMD AOCC 2.3: 14.66 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

Geometric Mean Of All Test Results

Result Composite - EPYC 7502 AOCC 2.3 Compiler Comparison

Geometric Mean, More Is Better
LLVM Clang 11: 115.39
GCC 10.2: 113.07
AMD AOCC 2.3: 121.37
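
The composite scores are easier to compare in relative terms. The following is a minimal Python sketch, assuming only the three geometric means reported above. The helper names relative_to and geometric_mean are hypothetical and not part of the Phoronix Test Suite, and the geometric_mean function only illustrates the general formula; how the suite normalizes "fewer is better" results before averaging is not shown in this export and is not reproduced here.

```python
import math

# Composite geometric means as reported in this result file (more is better).
geo_means = {
    "LLVM Clang 11": 115.39,
    "GCC 10.2": 113.07,
    "AMD AOCC 2.3": 121.37,
}

def geometric_mean(values):
    """General formula: exp of the mean of the logs (illustrative only)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def relative_to(baseline, scores):
    """Percentage difference of every score against the chosen baseline."""
    base = scores[baseline]
    return {name: (value / base - 1.0) * 100.0 for name, value in scores.items()}

if __name__ == "__main__":
    for name, pct in relative_to("GCC 10.2", geo_means).items():
        print(f"{name}: {pct:+.1f}% vs GCC 10.2")
```

With GCC 10.2 as the baseline, this puts AMD AOCC 2.3 roughly 7.3% ahead and LLVM Clang 11 roughly 2.1% ahead on the composite.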

Number Of First Place Finishes

Wins - 89 Tests

LLVM Clang 11: 11 [12.4%]
GCC 10.2: 17 [19.1%]
AMD AOCC 2.3: 61 [68.5%]

Number Of Last Place Finishes

Losses - 89 Tests

LLVM Clang 11: 23 [25.8%]
GCC 10.2: 56 [62.9%]
AMD AOCC 2.3: 10 [11.2%]
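
The win and loss percentages follow directly from the counts over 89 tests. A short Python sketch, using only the counts reported above, recomputes them as a sanity check; the function name shares is a hypothetical helper introduced for illustration.

```python
TOTAL_TESTS = 89

# First-place and last-place counts as reported in this result file.
wins = {"LLVM Clang 11": 11, "GCC 10.2": 17, "AMD AOCC 2.3": 61}
losses = {"LLVM Clang 11": 23, "GCC 10.2": 56, "AMD AOCC 2.3": 10}

def shares(counts, total):
    """Return each count as a percentage of total, rounded to one decimal."""
    if sum(counts.values()) != total:
        raise ValueError("counts do not add up to the number of tests")
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}

if __name__ == "__main__":
    print("Wins:  ", shares(wins, TOTAL_TESTS))
    print("Losses:", shares(losses, TOTAL_TESTS))
```

Both sets of counts add up to 89, so every test has exactly one first-place and one last-place finisher in this tally.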


Phoronix Test Suite v10.8.5