march updated tests

tests for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2103148-PTS-XXXMARCH31&gru.

march updated testsProcessorMotherboardChipsetMemoryDiskGraphicsMonitorNetworkOSKernelDesktopDisplay ServerOpenGLCompilerFile-SystemScreen Resolution7601 a7601 b7601 c7F52 a2 x AMD EPYC 7601 32-Core (64 Cores / 128 Threads)Dell 02MJ3T (1.2.5 BIOS)AMD 17h16 x 32 GB DDR4-2666MT/s 36ASF4G72PZ-2G6D23841GB Micron_9300_MTFDHAL3T8TDP + 11 x 500GB Samsung SSD 860 + 120GB INTEL SSDSCKJB120G7RllvmpipeVE2282 x Broadcom BCM57416 NetXtreme-E Dual-Media 10G RDMA + 2 x Broadcom NetXtreme BCM5720 2-port PCIeUbuntu 20.045.11.0-051100rc6daily20210201-generic (x86_64) 20210131GNOME Shell 3.36.4X Server4.5 Mesa 20.2.6 (LLVM 11.0.0 256 bits)GCC 9.3.0ext41600x12002 x AMD EPYC 7F52 16-Core @ 3.50GHz (32 Cores / 64 Threads)Supermicro H11DSi-NT v2.00 (2.1 BIOS)AMD Starship/Matisse16 x 8192 MB DDR4-3200MT/s HMA81GR7CJR8N-XN3841GB Micron_9300_MTFDHAL3T8TDP2 x Intel 10G X550T1920x1080OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- 7601 a: CPU Microcode: 0x8001250- 7601 b: CPU Microcode: 0x8001250- 7601 c: CPU Microcode: 0x8001250- 7F52 a: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034Security Details- 7601 a: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected - 7601 b: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected - 7601 c: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected - 7F52 a: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

march updated testsaom-av1: Speed 0 Two-Passaom-av1: Speed 4 Two-Passaom-av1: Speed 6 Realtimeaom-av1: Speed 6 Two-Passaom-av1: Speed 8 Realtimesimdjson: Kostyasimdjson: LargeRandsimdjson: PartialTweetssimdjson: DistinctUserIDastcenc: Mediumastcenc: Thoroughastcenc: Exhaustivebasis: ETC1Sbasis: UASTC Level 0basis: UASTC Level 2basis: UASTC Level 37601 a7601 b7601 c7F52 a0.163.7617.1712.4749.111.760.712.222.456.35734.400328.068337.8809.79415.07520.7950.163.7317.3812.3149.221.750.712.222.446.34714.400728.094438.7989.77415.09720.8600.163.8117.1812.3850.831.760.712.222.456.36444.403928.064937.9219.80515.31720.8460.276.7724.2820.1066.812.440.903.333.795.15879.664732.195030.0947.25013.56820.673OpenBenchmarking.org

Sysbench

Test: CPU

OpenBenchmarking.orgEvents Per Second, More Is BetterSysbench 1.0.20Test: CPU7F52 a14K28K42K56K70KSE +/- 21.49, N = 365129.271. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

AOM AV1

Encoder Mode: Speed 0 Two-Pass

OpenBenchmarking.orgFrames Per Second, More Is BetterAOM AV1 2.1-rcEncoder Mode: Speed 0 Two-Pass7601 a7601 b7601 c7F52 a0.06080.12160.18240.24320.304SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 30.160.160.160.271. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 4 Two-Pass

OpenBenchmarking.orgFrames Per Second, More Is BetterAOM AV1 2.1-rcEncoder Mode: Speed 4 Two-Pass7601 a7601 b7601 c7F52 a246810SE +/- 0.02, N = 3SE +/- 0.02, N = 3SE +/- 0.01, N = 3SE +/- 0.02, N = 33.763.733.816.771. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 6 Realtime

OpenBenchmarking.orgFrames Per Second, More Is BetterAOM AV1 2.1-rcEncoder Mode: Speed 6 Realtime7601 a7601 b7601 c7F52 a612182430SE +/- 0.24, N = 3SE +/- 0.11, N = 3SE +/- 0.05, N = 3SE +/- 0.20, N = 317.1717.3817.1824.281. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 6 Two-Pass

OpenBenchmarking.orgFrames Per Second, More Is BetterAOM AV1 2.1-rcEncoder Mode: Speed 6 Two-Pass7601 a7601 b7601 c7F52 a510152025SE +/- 0.17, N = 3SE +/- 0.06, N = 3SE +/- 0.05, N = 3SE +/- 0.19, N = 312.4712.3112.3820.101. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 8 Realtime

OpenBenchmarking.orgFrames Per Second, More Is BetterAOM AV1 2.1-rcEncoder Mode: Speed 8 Realtime7601 a7601 b7601 c7F52 a1530456075SE +/- 0.10, N = 3SE +/- 0.38, N = 15SE +/- 0.55, N = 15SE +/- 1.03, N = 1549.1149.2250.8366.811. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

simdjson

Throughput Test: Kostya

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 0.8.2Throughput Test: Kostya7601 a7601 b7601 c7F52 a0.5491.0981.6472.1962.745SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.01, N = 31.761.751.762.441. (CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: LargeRandom

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 0.8.2Throughput Test: LargeRandom7601 a7601 b7601 c7F52 a0.20250.4050.60750.811.0125SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 30.710.710.710.901. (CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: PartialTweets

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 0.8.2Throughput Test: PartialTweets7601 a7601 b7601 c7F52 a0.74931.49862.24792.99723.7465SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 32.222.222.223.331. (CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: DistinctUserID

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 0.8.2Throughput Test: DistinctUserID7601 a7601 b7601 c7F52 a0.85281.70562.55843.41124.264SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 32.452.442.453.791. (CXX) g++ options: -O3 -pthread

Sysbench

Test: RAM / Memory

OpenBenchmarking.orgMiB/sec, More Is BetterSysbench 1.0.20Test: RAM / Memory7F52 a400800120016002000SE +/- 25.05, N = 31904.401. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU7F52 a0.36170.72341.08511.44681.8085SE +/- 0.02169, N = 151.60749MIN: 1.311. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU7F52 a0.48780.97561.46341.95122.439SE +/- 0.01231, N = 32.16809MIN: 1.861. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU7F52 a0.33220.66440.99661.32881.661SE +/- 0.02758, N = 151.47660MIN: 1.161. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU7F52 a0.13850.2770.41550.5540.6925SE +/- 0.008864, N = 150.615501MIN: 0.511. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU7F52 a0.23950.4790.71850.9581.1975SE +/- 0.01517, N = 31.06434MIN: 0.971. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU7F52 a246810SE +/- 0.06367, N = 56.15122MIN: 5.081. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU7F52 a0.75891.51782.27673.03563.7945SE +/- 0.05194, N = 153.37275MIN: 2.671. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU7F52 a1.33042.66083.99125.32166.652SE +/- 0.69542, N = 155.91286MIN: 1.531. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU7F52 a0.36440.72881.09321.45761.822SE +/- 0.02349, N = 131.61951MIN: 1.271. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU7F52 a0.37570.75141.12711.50281.8785SE +/- 0.00276, N = 31.66979MIN: 1.641. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU7F52 a5001000150020002500SE +/- 31.41, N = 152287.04MIN: 1975.841. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU7F52 a2004006008001000SE +/- 44.71, N = 151027.87MIN: 757.71. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU7F52 a5001000150020002500SE +/- 27.12, N = 122219.13MIN: 1994.181. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU7F52 a2004006008001000SE +/- 44.21, N = 151005.38MIN: 748.391. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU7F52 a0.12130.24260.36390.48520.6065SE +/- 0.006739, N = 150.539208MIN: 0.431. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU7F52 a5001000150020002500SE +/- 29.79, N = 152262.94MIN: 1987.211. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU7F52 a2004006008001000SE +/- 43.87, N = 15970.31MIN: 748.991. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU7F52 a0.22250.4450.66750.891.1125SE +/- 0.003400, N = 30.988770MIN: 0.941. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

ASTC Encoder

Preset: Medium

OpenBenchmarking.orgSeconds, Fewer Is BetterASTC Encoder 2.4Preset: Medium7601 a7601 b7601 c7F52 a246810SE +/- 0.0277, N = 3SE +/- 0.0183, N = 3SE +/- 0.0304, N = 3SE +/- 0.0048, N = 36.35736.34716.36445.15871. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Thorough

OpenBenchmarking.orgSeconds, Fewer Is BetterASTC Encoder 2.4Preset: Thorough7601 a7601 b7601 c7F52 a3691215SE +/- 0.0021, N = 3SE +/- 0.0005, N = 3SE +/- 0.0018, N = 3SE +/- 0.0086, N = 34.40034.40074.40399.66471. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Exhaustive

OpenBenchmarking.orgSeconds, Fewer Is BetterASTC Encoder 2.4Preset: Exhaustive7601 a7601 b7601 c7F52 a714212835SE +/- 0.02, N = 3SE +/- 0.03, N = 3SE +/- 0.03, N = 3SE +/- 0.04, N = 328.0728.0928.0632.201. (CXX) g++ options: -O3 -flto -pthread

Basis Universal

Settings: ETC1S

OpenBenchmarking.orgSeconds, Fewer Is BetterBasis Universal 1.13Settings: ETC1S7601 a7601 b7601 c7F52 a918273645SE +/- 0.12, N = 3SE +/- 0.52, N = 3SE +/- 0.27, N = 3SE +/- 0.30, N = 337.8838.8037.9230.091. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 0

OpenBenchmarking.orgSeconds, Fewer Is BetterBasis Universal 1.13Settings: UASTC Level 07601 a7601 b7601 c7F52 a3691215SE +/- 0.053, N = 3SE +/- 0.032, N = 3SE +/- 0.009, N = 3SE +/- 0.009, N = 39.7949.7749.8057.2501. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 2

OpenBenchmarking.orgSeconds, Fewer Is BetterBasis Universal 1.13Settings: UASTC Level 27601 a7601 b7601 c7F52 a48121620SE +/- 0.09, N = 3SE +/- 0.09, N = 3SE +/- 0.04, N = 3SE +/- 0.03, N = 315.0815.1015.3213.571. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 3

OpenBenchmarking.orgSeconds, Fewer Is BetterBasis Universal 1.13Settings: UASTC Level 37601 a7601 b7601 c7F52 a510152025SE +/- 0.04, N = 3SE +/- 0.09, N = 3SE +/- 0.02, N = 3SE +/- 0.03, N = 320.8020.8620.8520.671. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread


Phoronix Test Suite v10.8.4