AMD AOCC 3.1 Compiler Benchmarking

AMD EPYC 7543 testing of AMD AOCC 3.1 compiler benchmarks by Michael Larabel.

HTML result view exported from: https://openbenchmarking.org/result/2107228-IB-AOCC31BEN66&rdt.

System configuration (AOCC 3.1 and AOCC 3.0):

Processor: AMD EPYC 7543 32-Core @ 2.80GHz (32 Cores / 64 Threads)
Motherboard: TYAN S8036GM2NE-LE (V2.00.B21 BIOS)
Chipset: AMD Starship/Matisse
Memory: 64GB
Disk: 1000GB Corsair Force MP600
Graphics: ASPEED
Monitor: VE228
Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 21.04
Kernel: 5.11.0-25-generic (x86_64)
Desktop: GNOME Shell 3.38.4
Display Server: X Server
Compiler: Clang 12.0.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -march=znver3" CFLAGS="-O3 -march=znver3"

Compiler Details
- AOCC 3.1: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver3
- AOCC 3.0: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: (unknown)

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa001119

Python Details
- Python 3.9.5

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

Results overview

Test | AOCC 3.1 | AOCC 3.0
blosc: blosclz | 25069.6 | 25045.5
quantlib | 2854.9 | 2857.6
etcpak: DXT1 | 2897.284 | 2799.898
etcpak: ETC2 | 207.550 | 188.815
compress-lz4: 1 - Compression Speed | 10964.11 | 10359.14
compress-zstd: 8 - Compression Speed | 2852.4 | 2752.3
compress-zstd: 19, Long Mode - Compression Speed | 42.1 | 42.4
compress-zstd: 19, Long Mode - Decompression Speed | 3264.4 | 3270.0
jpegxl: PNG - 7 | 9.04 | 9.01
jpegxl: JPEG - 7 | 65.89 | 64.23
jpegxl: JPEG - 8 | 28.30 | 27.44
botan: AES-256 | 4859.660 | 4869.920
botan: AES-256 - Decrypt | 4876.955 | 4866.532
botan: Twofish | 332.808 | 332.451
botan: Twofish - Decrypt | 339.974 | 339.970
botan: Blowfish | 417.045 | 417.290
botan: Blowfish - Decrypt | 405.311 | 404.413
botan: CAST-256 | 134.367 | 134.375
botan: CAST-256 - Decrypt | 137.907 | 137.884
libraw: Post-Processing Benchmark | 45.11 | 44.54
john-the-ripper: Blowfish | 61842 | 61893
john-the-ripper: MD5 | 2240000 | 2237667
graphics-magick: Rotate | 792 | 792
graphics-magick: Sharpen | 357 | 357
graphics-magick: Enhanced | 696 | 553
graphics-magick: Noise-Gaussian | 420 | 422
graphics-magick: HWB Color Space | 677 | 667
svt-av1: Preset 4 - Bosphorus 4K | 1.987 | n/a
svt-av1: Preset 8 - Bosphorus 4K | 19.467 | n/a
svt-hevc: 7 - Bosphorus 1080p | 303.75 | 292.33
svt-hevc: 10 - Bosphorus 1080p | 408.51 | 399.61
svt-vp9: Visual Quality Optimized - Bosphorus 1080p | 234.18 | 234.66
vpxenc: Speed 5 - Bosphorus 4K | 15.62 | 15.61
himeno: Poisson Pressure Solver | 3819.985446 | 3779.478230
stockfish: Total Time | 90448030 | 87952290
avifenc: 2 | 24.767 | 24.563
avifenc: 6 | 9.672 | 9.731
avifenc: 10 | 3.497 | 3.542
avifenc: 6, Lossless | 26.666 | 26.725
avifenc: 10, Lossless | 5.800 | 5.842
povray: Trace Time | 14.779 | 17.014
onednn: IP Shapes 1D - f32 - CPU | 1.40164 | 1.39552
onednn: IP Shapes 3D - f32 - CPU | 5.07547 | 4.97905
onednn: Convolution Batch Shapes Auto - f32 - CPU | 1.40674 | 2.11715
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 1.80096 | 1.79719
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 3.13186 | 3.08589
onednn: Recurrent Neural Network Training - f32 - CPU | 6432.49 | 6386.24
onednn: Recurrent Neural Network Inference - f32 - CPU | 668.789 | 665.974
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.435999 | 0.434393
encode-flac: WAV To FLAC | 8.700 | 8.697
encode-mp3: WAV To MP3 | 7.828 | 7.877
ngspice: C2670 | 91.522 | 92.525
ngspice: C7552 | 82.279 | 83.120
rnnoise | 18.733 | 18.658
tachyon: Total Time | 28.6622 | 28.8937
synthmark: VoiceMark_100 | 621.629 | 622.227
securemark: SecureMark-TLS | 279931 | 275488
liquid-dsp: 16 - 256 - 57 | 806510000 | 797476667
liquid-dsp: 32 - 256 - 57 | 1527933333 | 1507633333
liquid-dsp: 64 - 256 - 57 | 1894966667 | 1891833333
financebench: Repo OpenMP | 33763.578125 | 33706.023438
financebench: Bonds OpenMP | 51185.485677 | 51304.153646
tjbench: Decompression Throughput | 220.671079 | 222.014883
astcenc: Exhaustive | 21.8938 | 21.9517
sqlite-speedtest: Timed Time - Size 1,000 | 55.235 | n/a
draco: Lion | 4960 | 5303
draco: Church Facade | 6488 | 7044
ncnn: CPU-v2-v2 - mobilenet-v2 | 5.74 | 5.79
ncnn: CPU-v3-v3 - mobilenet-v3 | 5.60 | 5.73
ncnn: CPU - vgg16 | 51.60 | 61.79
ncnn: CPU - resnet18 | 11.88 | 12.21
tnn: CPU - SqueezeNet v2 | 60.655 | 59.511
tnn: CPU - SqueezeNet v1.1 | 266.241 | 262.356
rocksdb: Update Rand | 832651 | 820188
rocksdb: Read While Writing | 6135560 | 6048186
rocksdb: Read Rand Write Rand | 3059068 | 2970829
onnx: bertsquad-10 - OpenMP CPU | 487 | 474
onnx: fcn-resnet101-11 - OpenMP CPU | 90 | 86
onnx: super-resolution-10 - OpenMP CPU | 4568 | 4759
encode-wavpack: WAV To WavPack | 13.175 | 13.163
gnupg: 2.7GB Sample File Encryption | 67.609 | 67.741

C-Blosc

Compressor: blosclz

C-Blosc 2.0 (MB/s, More Is Better)
AOCC 3.1: 25069.6 (SE +/- 100.31, N = 3)
AOCC 3.0: 25045.5 (SE +/- 83.54, N = 3)
(CXX) g++ options: -O3 -march=znver3

QuantLib

QuantLib 1.21 (MFLOPS, More Is Better)
AOCC 3.1: 2854.9 (SE +/- 5.98, N = 3)
AOCC 3.0: 2857.6 (SE +/- 3.82, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic

Etcpak

Configuration: DXT1

Etcpak 0.7 (Mpx/s, More Is Better)
AOCC 3.1: 2897.28 (SE +/- 4.90, N = 3)
AOCC 3.0: 2799.90 (SE +/- 10.86, N = 3)
(CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

Etcpak

Configuration: ETC2

Etcpak 0.7 (Mpx/s, More Is Better)
AOCC 3.1: 207.55 (SE +/- 0.03, N = 3)
AOCC 3.0: 188.82 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

LZ4 Compression

Compression Level: 1 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
AOCC 3.1: 10964.11 (SE +/- 7.15, N = 3)
AOCC 3.0: 10359.14 (SE +/- 12.79, N = 3)
(CC) gcc options: -O3

Zstd Compression

Compression Level: 8 - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AOCC 3.1: 2852.4 (SE +/- 17.26, N = 3)
AOCC 3.0: 2752.3 (SE +/- 25.08, N = 7)
(CC) gcc options: -O3 -march=znver3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AOCC 3.1: 42.1 (SE +/- 0.38, N = 3)
AOCC 3.0: 42.4 (SE +/- 0.52, N = 4)
(CC) gcc options: -O3 -march=znver3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AOCC 3.1: 3264.4 (SE +/- 6.78, N = 3)
AOCC 3.0: 3270.0 (SE +/- 5.68, N = 4)
(CC) gcc options: -O3 -march=znver3 -pthread -lz -llzma

JPEG XL

Input: PNG - Encode Speed: 7

JPEG XL 0.3.3 (MP/s, More Is Better)
AOCC 3.1: 9.04 (SE +/- 0.01, N = 3)
AOCC 3.0: 9.01 (SE +/- 0.04, N = 3)
(CXX) g++ options: -O3 -march=znver3 -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie -ldl

JPEG XL

Input: JPEG - Encode Speed: 7

JPEG XL 0.3.3 (MP/s, More Is Better)
AOCC 3.1: 65.89 (SE +/- 0.18, N = 3)
AOCC 3.0: 64.23 (SE +/- 0.27, N = 3)
(CXX) g++ options: -O3 -march=znver3 -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie -ldl

JPEG XL

Input: JPEG - Encode Speed: 8

JPEG XL 0.3.3 (MP/s, More Is Better)
AOCC 3.1: 28.30 (SE +/- 0.04, N = 3)
AOCC 3.0: 27.44 (SE +/- 0.10, N = 3)
(CXX) g++ options: -O3 -march=znver3 -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie -ldl

Botan

Test: AES-256

Botan 2.17.3 (MiB/s, More Is Better)
AOCC 3.1: 4859.66 (SE +/- 3.96, N = 3)
AOCC 3.0: 4869.92 (SE +/- 1.46, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: AES-256 - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AOCC 3.1: 4876.96 (SE +/- 6.85, N = 3)
AOCC 3.0: 4866.53 (SE +/- 4.76, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish

Botan 2.17.3 (MiB/s, More Is Better)
AOCC 3.1: 332.81 (SE +/- 0.05, N = 3)
AOCC 3.0: 332.45 (SE +/- 0.03, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AOCC 3.1: 339.97 (SE +/- 0.13, N = 3)
AOCC 3.0: 339.97 (SE +/- 0.05, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Blowfish

Botan 2.17.3 (MiB/s, More Is Better)
AOCC 3.1: 417.05 (SE +/- 0.17, N = 3)
AOCC 3.0: 417.29 (SE +/- 0.20, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Blowfish - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AOCC 3.1: 405.31 (SE +/- 0.11, N = 3)
AOCC 3.0: 404.41 (SE +/- 0.14, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256

Botan 2.17.3 (MiB/s, More Is Better)
AOCC 3.1: 134.37 (SE +/- 0.02, N = 3)
AOCC 3.0: 134.38 (SE +/- 0.02, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256 - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AOCC 3.1: 137.91 (SE +/- 0.02, N = 3)
AOCC 3.0: 137.88 (SE +/- 0.01, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

LibRaw

Post-Processing Benchmark

LibRaw 0.20 (Mpix/sec, More Is Better)
AOCC 3.1: 45.11 (SE +/- 0.48, N = 3)
AOCC 3.0: 44.54 (SE +/- 0.22, N = 3)
(CXX) g++ options: -O3 -march=znver3 -fopenmp -ljpeg -lz -lm

John The Ripper

Test: Blowfish

John The Ripper 1.9.0-jumbo-1 (Real C/S, More Is Better)
AOCC 3.1: 61842 (SE +/- 125.66, N = 3)
AOCC 3.0: 61893 (SE +/- 194.70, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

John The Ripper

Test: MD5

John The Ripper 1.9.0-jumbo-1 (Real C/S, More Is Better)
AOCC 3.1: 2240000 (SE +/- 23259.41, N = 3)
AOCC 3.0: 2237667 (SE +/- 8875.68, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AOCC 3.1: 792 (SE +/- 3.71, N = 3)
AOCC 3.0: 792 (SE +/- 5.24, N = 3)
(CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AOCC 3.1: 357
AOCC 3.0: 357
SE +/- 0.33, N = 3 (the export lists a standard error for only one of the two configurations)
(CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AOCC 3.1: 696 (SE +/- 0.33, N = 3)
AOCC 3.0: 553 (SE +/- 0.67, N = 3)
(CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AOCC 3.1: 420 (SE +/- 1.53, N = 3)
AOCC 3.0: 422 (SE +/- 0.67, N = 3)
(CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AOCC 3.1: 677 (SE +/- 1.73, N = 3)
AOCC 3.0: 667 (SE +/- 1.73, N = 3)
(CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

SVT-AV1

Encoder Mode: Preset 4 - Input: Bosphorus 4K

SVT-AV1 0.8.7 (Frames Per Second, More Is Better)
AOCC 3.1: 1.987 (SE +/- 0.001, N = 3)
(CXX) g++ options: -O3 -march=znver3 -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

SVT-AV1 0.8.7 (Frames Per Second, More Is Better)
AOCC 3.1: 19.47 (SE +/- 0.10, N = 3)
(CXX) g++ options: -O3 -march=znver3 -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

SVT-HEVC

Tuning: 7 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
AOCC 3.1: 303.75 (SE +/- 4.16, N = 3)
AOCC 3.0: 292.33 (SE +/- 3.36, N = 3)
(CC) gcc options: -O3 -march=znver3 -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 10 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
AOCC 3.1: 408.51 (SE +/- 5.79, N = 15)
AOCC 3.0: 399.61 (SE +/- 5.37, N = 3)
(CC) gcc options: -O3 -march=znver3 -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
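
The two SVT-HEVC Tuning 10 results have similar SE values but very different sample counts (N = 15 vs. N = 3). A minimal sketch of how the run-to-run spread could be recovered from these figures, assuming the reported SE is the standard error of the mean (standard deviation divided by the square root of N); the helper name `std_dev_from_se` is hypothetical and not part of the Phoronix Test Suite:

```python
from math import sqrt


def std_dev_from_se(se, n):
    """Recover the sample standard deviation from a standard error of the mean.

    Assumption: SE = SD / sqrt(N), so SD = SE * sqrt(N).
    """
    return se * sqrt(n)


# SVT-HEVC, Tuning 10, Bosphorus 1080p (SE and N as listed above)
sd_aocc31 = std_dev_from_se(5.79, 15)
sd_aocc30 = std_dev_from_se(5.37, 3)

print(f"AOCC 3.1 estimated SD: {sd_aocc31:.2f} FPS")
print(f"AOCC 3.0 estimated SD: {sd_aocc30:.2f} FPS")
```

Under that assumption, the AOCC 3.1 runs varied more from run to run than the similar SE suggests, because the larger N shrinks the SE.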

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3 (Frames Per Second, More Is Better)
AOCC 3.1: 234.18 (SE +/- 1.81, N = 3)
AOCC 3.0: 234.66 (SE +/- 0.29, N = 3)
(CC) gcc options: -O3 -fcommon -march=znver3 -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

VP9 libvpx Encoding

Speed: Speed 5 - Input: Bosphorus 4K

VP9 libvpx Encoding 1.10.0 (Frames Per Second, More Is Better)
AOCC 3.1: 15.62 (SE +/- 0.12, N = 3)
AOCC 3.0: 15.61 (SE +/- 0.15, N = 6)
(CXX) g++ options: -m64 -lm -lpthread -O3 -march=znver3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 (MFLOPS, More Is Better)
AOCC 3.1: 3819.99 (SE +/- 26.55, N = 3)
AOCC 3.0: 3779.48 (SE +/- 45.95, N = 15)
(CC) gcc options: -O3 -march=znver3 -mavx2

Stockfish

Total Time

Stockfish 13 (Nodes Per Second, More Is Better)
AOCC 3.1: 90448030 (SE +/- 655436.02, N = 15)
AOCC 3.0: 87952290 (SE +/- 682844.99, N = 10)
(CXX) g++ options: -fprofile-use -m64 -lpthread -O3 -march=znver3 -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -flto

libavif avifenc

Encoder Speed: 2

libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
AOCC 3.1: 24.77 (SE +/- 0.25, N = 3)
AOCC 3.0: 24.56 (SE +/- 0.10, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 6

libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
AOCC 3.1: 9.672 (SE +/- 0.067, N = 3)
AOCC 3.0: 9.731 (SE +/- 0.040, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 10

libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
AOCC 3.1: 3.497 (SE +/- 0.022, N = 3)
AOCC 3.0: 3.542 (SE +/- 0.020, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 6, Lossless

libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
AOCC 3.1: 26.67 (SE +/- 0.14, N = 3)
AOCC 3.0: 26.73 (SE +/- 0.25, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 10, Lossless

libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
AOCC 3.1: 5.800 (SE +/- 0.053, N = 3)
AOCC 3.0: 5.842 (SE +/- 0.032, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

POV-Ray

Trace Time

POV-Ray 3.7.0.7 (Seconds, Fewer Is Better)
AOCC 3.1: 14.78 (SE +/- 0.04, N = 3)
AOCC 3.0: 17.01 (SE +/- 0.03, N = 3)
(CXX) g++ options: -pipe -O3 -ffast-math -march=native -march=znver3 -pthread -lXpm -lSM -lICE -lX11 -lIlmImf -lIlmImf-2_5 -lImath-2_5 -lHalf-2_5 -lIex-2_5 -lIexMath-2_5 -lIlmThread-2_5 -lIlmThread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 1.40164 (SE +/- 0.00055, N = 3; MIN: 1.27)
AOCC 3.0: 1.39552 (SE +/- 0.00351, N = 3; MIN: 1.27)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 5.07547 (SE +/- 0.00433, N = 3; MIN: 4.96)
AOCC 3.0: 4.97905 (SE +/- 0.00392, N = 3; MIN: 4.88)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 1.40674 (SE +/- 0.01062, N = 3; MIN: 1.25)
AOCC 3.0: 2.11715 (SE +/- 0.01448, N = 13; MIN: 1.89)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 1.80096 (SE +/- 0.00353, N = 3; MIN: 1.57)
AOCC 3.0: 1.79719 (SE +/- 0.00278, N = 3; MIN: 1.57)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 3.13186 (SE +/- 0.00672, N = 3; MIN: 2.76)
AOCC 3.0: 3.08589 (SE +/- 0.00522, N = 3; MIN: 2.78)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 6432.49 (SE +/- 68.78, N = 3; MIN: 6265.76)
AOCC 3.0: 6386.24 (SE +/- 48.40, N = 15; MIN: 5873.51)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 668.79 (SE +/- 0.23, N = 3; MIN: 659.59)
AOCC 3.0: 665.97 (SE +/- 0.22, N = 3; MIN: 659.9)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 0.435999 (SE +/- 0.000575, N = 3; MIN: 0.39)
AOCC 3.0: 0.434393 (SE +/- 0.000098, N = 3; MIN: 0.4)
(CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.2 (Seconds, Fewer Is Better)
AOCC 3.1: 8.700 (SE +/- 0.002, N = 5)
AOCC 3.0: 8.697 (SE +/- 0.004, N = 5)
(CXX) g++ options: -O3 -march=znver3 -logg -lm

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100 (Seconds, Fewer Is Better)
AOCC 3.1: 7.828 (SE +/- 0.003, N = 3)
AOCC 3.0: 7.877 (SE +/- 0.031, N = 3)
(CC) gcc options: -O3 -pipe -march=znver3 -lm

Ngspice

Circuit: C2670

Ngspice 34 (Seconds, Fewer Is Better)
AOCC 3.1: 91.52 (SE +/- 0.15, N = 3)
AOCC 3.0: 92.53 (SE +/- 0.11, N = 3)
(CC) gcc options: -O3 -march=znver3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

Ngspice

Circuit: C7552

Ngspice 34 (Seconds, Fewer Is Better)
AOCC 3.1: 82.28 (SE +/- 0.29, N = 3)
AOCC 3.0: 83.12 (SE +/- 0.33, N = 3)
(CC) gcc options: -O3 -march=znver3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

RNNoise

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
AOCC 3.1: 18.73 (SE +/- 0.05, N = 3)
AOCC 3.0: 18.66 (SE +/- 0.02, N = 3)
(CC) gcc options: -O3 -march=znver3 -pedantic -fvisibility=hidden

Tachyon

Total Time

Tachyon 0.99b6 (Seconds, Fewer Is Better)
AOCC 3.1: 28.66 (SE +/- 0.03, N = 3)
AOCC 3.0: 28.89 (SE +/- 0.20, N = 3)
(CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread

Google SynthMark

Test: VoiceMark_100

Google SynthMark 20201109 (Voices, More Is Better)
AOCC 3.1: 621.63 (SE +/- 0.49, N = 3)
AOCC 3.0: 622.23 (SE +/- 0.20, N = 3)
(CXX) g++ options: -lm -lpthread -std=c++11 -Ofast

SecureMark

Benchmark: SecureMark-TLS

SecureMark 1.0.4 (marks, More Is Better)
AOCC 3.1: 279931 (SE +/- 250.58, N = 3)
AOCC 3.0: 275488 (SE +/- 1063.58, N = 3)
(CC) gcc options: -pedantic -O3

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
AOCC 3.1: 806510000 (SE +/- 596852.86, N = 3)
AOCC 3.0: 797476667 (SE +/- 710969.60, N = 3)
(CC) gcc options: -O3 -march=znver3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
AOCC 3.1: 1527933333 (SE +/- 635959.47, N = 3)
AOCC 3.0: 1507633333 (SE +/- 696020.43, N = 3)
(CC) gcc options: -O3 -march=znver3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
AOCC 3.1: 1894966667 (SE +/- 983756.97, N = 3)
AOCC 3.0: 1891833333 (SE +/- 520683.31, N = 3)
(CC) gcc options: -O3 -march=znver3 -pthread -lm -lc -lliquid

FinanceBench

Benchmark: Repo OpenMP

FinanceBench 2016-07-25 (ms, Fewer Is Better)
AOCC 3.1: 33763.58 (SE +/- 308.02, N = 3)
AOCC 3.0: 33706.02 (SE +/- 124.42, N = 3)
(CXX) g++ options: -O3 -march=native -fopenmp

FinanceBench

Benchmark: Bonds OpenMP

FinanceBench 2016-07-25 (ms, Fewer Is Better)
AOCC 3.1: 51185.49 (SE +/- 150.62, N = 3)
AOCC 3.0: 51304.15 (SE +/- 198.20, N = 3)
(CXX) g++ options: -O3 -march=native -fopenmp

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 2.1.0 (Megapixels/sec, More Is Better)
AOCC 3.1: 220.67 (SE +/- 0.05, N = 3)
AOCC 3.0: 222.01 (SE +/- 0.39, N = 3)
(CC) gcc options: -O3 -march=znver3 -rdynamic

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 3.0 (Seconds, Fewer Is Better)
AOCC 3.1: 21.89 (SE +/- 0.01, N = 3)
AOCC 3.0: 21.95 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -march=znver3 -flto -pthread

SQLite Speedtest

Timed Time - Size 1,000

SQLite Speedtest 3.30 (Seconds, Fewer Is Better)
AOCC 3.1: 55.24 (SE +/- 0.06, N = 3)
(CC) gcc options: -O3 -march=znver3 -ldl -lz -lpthread

Google Draco

Model: Lion

Google Draco 1.4.1 (ms, Fewer Is Better)
AOCC 3.1: 4960 (SE +/- 25.93, N = 3)
AOCC 3.0: 5303 (SE +/- 3.84, N = 3)
(CXX) g++ options: -O3 -march=znver3

Google Draco

Model: Church Facade

Google Draco 1.4.1 (ms, Fewer Is Better)
AOCC 3.1: 6488 (SE +/- 49.64, N = 14)
AOCC 3.0: 7044 (SE +/- 39.87, N = 3)
(CXX) g++ options: -O3 -march=znver3

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20210525 (ms, Fewer Is Better)
AOCC 3.1: 5.74 (SE +/- 0.03, N = 15; MIN: 5.3 / MAX: 32.98)
AOCC 3.0: 5.79 (SE +/- 0.05, N = 15; MIN: 5.47 / MAX: 7.26)
(CXX) g++ options: -O3 -march=znver3 -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20210525 (ms, Fewer Is Better)
AOCC 3.1: 5.60 (SE +/- 0.04, N = 15; MIN: 4.66 / MAX: 56.3)
AOCC 3.0: 5.73 (SE +/- 0.11, N = 15; MIN: 4.91 / MAX: 48.44)
(CXX) g++ options: -O3 -march=znver3 -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: vgg16

NCNN 20210525 (ms, Fewer Is Better)
AOCC 3.1: 51.60 (SE +/- 1.18, N = 15; MIN: 29.8 / MAX: 198.51)
AOCC 3.0: 61.79 (SE +/- 2.57, N = 15; MIN: 44.7 / MAX: 201.09)
(CXX) g++ options: -O3 -march=znver3 -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: resnet18

NCNN 20210525 (ms, Fewer Is Better)
AOCC 3.1: 11.88 (SE +/- 0.09, N = 15; MIN: 10.11 / MAX: 110.24)
AOCC 3.0: 12.21 (SE +/- 0.12, N = 15; MIN: 11.3 / MAX: 22.59)
(CXX) g++ options: -O3 -march=znver3 -rdynamic -lomp -lpthread -pthread

TNN

Target: CPU - Model: SqueezeNet v2

TNN 0.3 (ms, Fewer Is Better)
AOCC 3.1: 60.66 (SE +/- 0.03, N = 3; MIN: 60.48 / MAX: 60.97)
AOCC 3.0: 59.51 (SE +/- 0.19, N = 3; MIN: 59.13 / MAX: 60.16)
(CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.3 (ms, Fewer Is Better)
AOCC 3.1: 266.24 (SE +/- 0.08, N = 3; MIN: 265.84 / MAX: 266.54)
AOCC 3.0: 262.36 (SE +/- 0.07, N = 3; MIN: 262.07 / MAX: 262.65)
(CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

Facebook RocksDB

Test: Update Random

Facebook RocksDB 6.22.1 (Op/s, More Is Better)
AOCC 3.1: 832651 (SE +/- 1446.17, N = 3)
AOCC 3.0: 820188 (SE +/- 3760.31, N = 3)
(CXX) g++ options: -O3 -march=native -pthread -fno-rtti -latomic -lpthread

Facebook RocksDB

Test: Read While Writing

Facebook RocksDB 6.22.1 (Op/s, More Is Better)
AOCC 3.1: 6135560 (SE +/- 55702.89, N = 3)
AOCC 3.0: 6048186 (SE +/- 39263.64, N = 15)
(CXX) g++ options: -O3 -march=native -pthread -fno-rtti -latomic -lpthread

Facebook RocksDB

Test: Read Random Write Random

Facebook RocksDB 6.22.1 (Op/s, More Is Better)
AOCC 3.1: 3059068 (SE +/- 32512.94, N = 4)
AOCC 3.0: 2970829 (SE +/- 4965.23, N = 3)
(CXX) g++ options: -O3 -march=native -pthread -fno-rtti -latomic -lpthread

ONNX Runtime

Model: bertsquad-10 - Device: OpenMP CPU

ONNX Runtime 1.6 (Inferences Per Minute, More Is Better)
AOCC 3.1: 487 (SE +/- 4.81, N = 5)
AOCC 3.0: 474 (SE +/- 6.29, N = 3)
(CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -ffunction-sections -fdata-sections -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: OpenMP CPU

ONNX Runtime 1.6 (Inferences Per Minute, More Is Better)
AOCC 3.1: 90 (SE +/- 1.74, N = 11)
AOCC 3.0: 86 (SE +/- 0.44, N = 3)
(CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -ffunction-sections -fdata-sections -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: OpenMP CPU

ONNX Runtime 1.6 (Inferences Per Minute, More Is Better)
AOCC 3.1: 4568 (SE +/- 49.29, N = 5)
AOCC 3.0: 4759 (SE +/- 11.37, N = 3)
(CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -ffunction-sections -fdata-sections -ldl -lrt

WavPack Audio Encoding

WAV To WavPack

WavPack Audio Encoding 5.3 (Seconds, Fewer Is Better)
AOCC 3.1: 13.18 (SE +/- 0.00, N = 5)
AOCC 3.0: 13.16 (SE +/- 0.00, N = 5)
(CXX) g++ options: -O3 -march=znver3 -rdynamic

GnuPG

2.7GB Sample File Encryption

GnuPG 2.2.27 (Seconds, Fewer Is Better)
AOCC 3.1: 67.61 (SE +/- 0.24, N = 3)
AOCC 3.0: 67.74 (SE +/- 0.37, N = 3)
(CC) gcc options: -O3 -march=znver3
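
Each result above pairs a mean with a standard error (SE), and the tests mix "More Is Better" and "Fewer Is Better" metrics. A minimal sketch of how two such results might be compared is shown here. The function name `compare` and the "gap larger than twice the combined SE" threshold are illustrative assumptions, a rough rule of thumb rather than anything the Phoronix Test Suite itself computes.

```python
from math import sqrt


def compare(new, old, se_new, se_old, higher_is_better=True):
    """Return (percent gain of `new` over `old`, True if the gap looks clear).

    Assumption: the gap is treated as clear when it exceeds twice the
    combined standard error, which is only a rough heuristic for N = 3 runs.
    """
    if higher_is_better:
        gain = (new / old - 1.0) * 100.0
    else:
        # For time-based metrics a smaller value is the faster result.
        gain = (old / new - 1.0) * 100.0
    combined_se = sqrt(se_new ** 2 + se_old ** 2)
    clear = abs(new - old) > 2.0 * combined_se
    return gain, clear


# AOCC 3.1 vs. AOCC 3.0, values taken from the results above
etc2 = compare(207.55, 188.82, 0.03, 0.03)                          # Etcpak ETC2, Mpx/s
povray = compare(14.78, 17.01, 0.04, 0.03, higher_is_better=False)  # POV-Ray, seconds
cast = compare(134.37, 134.38, 0.02, 0.02)                          # Botan CAST-256, MiB/s

for name, (gain, clear) in [("Etcpak ETC2", etc2), ("POV-Ray", povray), ("Botan CAST-256", cast)]:
    print(f"{name}: {gain:+.1f}% ({'clear' if clear else 'within noise'})")
```

With these inputs the Etcpak ETC2 and POV-Ray gaps come out well outside the combined SE, while the Botan CAST-256 difference does not.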


Phoronix Test Suite v10.8.4