Xeon Silver March

Intel Xeon Silver 4216 testing with a TYAN S7100AG2NR (V4.02 BIOS) and ASPEED on Debian 10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2103218-HA-XEONSILVE43&grr&rdt&rro.

Xeon Silver March: system details (runs 1, 2, 3)

Processor: Intel Xeon Silver 4216 @ 3.20GHz (16 Cores / 32 Threads)
Motherboard: TYAN S7100AG2NR (V4.02 BIOS)
Chipset: Intel Sky Lake-E DMI3 Registers
Memory: 24GB
Disk: 240GB Corsair Force MP500
Graphics: ASPEED
Audio: Realtek ALC892
Network: 2 x Intel I350
OS: Debian 10
Kernel: 4.19.0-9-amd64 (x86_64)
Desktop: GNOME Shell 3.30.2
Display Server: X Server
Compiler: GCC 8.3.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details
- Transparent Huge Pages: always

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave
- CPU Microcode: 0x500002c

Python Details
- 1: Python 2.7.16 + Python 3.7.3

Security Details
- itlb_multihit: KVM: Mitigation of Split huge pages
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Mitigation of TSX disabled

Xeon Silver March: results overview

Test = run 1, run 2, run 3

luaradio: Complex Phase = 434.1, 438.2, 439.2
luaradio: Hilbert Transform = 55.4, 55.4, 55.4
luaradio: FM Deemphasis Filter = 254.2, 254.0, 253.3
luaradio: Five Back to Back FIR Filters = 808.6, 807.8, 810.2
aom-av1: Speed 4 Two-Pass = 3.77, 3.77, 3.76
mnn: inception-v3 = 55.095, 54.547, 54.090
mnn: mobilenet-v1-1.0 = 3.265, 3.227, 3.292
mnn: MobileNetV2_224 = 5.038, 5.095, 5.099
mnn: resnet-v2-50 = 43.656, 43.985, 44.232
mnn: SqueezeNetV1.0 = 8.133, 8.130, 8.236
aom-av1: Speed 0 Two-Pass = 0.21, 0.20, 0.21
sysbench: CPU = 22943.61, 22942.83, 22943.20
onednn: Recurrent Neural Network Training - u8s8f32 - CPU = 3589.87, 3589.97, 3592.30
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU = 3589.54, 3593.32, 3588.05
onednn: Recurrent Neural Network Training - f32 - CPU = 3586.70, 3588.58, 3592.55
aom-av1: Speed 6 Two-Pass = 9.29, 9.31, 9.31
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU = 1853.05, 1853.25, 1855.59
svt-hevc: 1 - Bosphorus 1080p = 7.88, 7.86, 7.89
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU = 1852.81, 1853.80, 1852.34
onednn: Recurrent Neural Network Inference - f32 - CPU = 1859.12, 1856.49, 1854.28
simdjson: Kostya = 1.13, 1.12, 1.13
incompact3d: input.i3d 193 Cells Per Direction = 66.4466426, 66.6669617, 66.1601740
simdjson: LargeRand = 0.38, 0.38, 0.38
basis: UASTC Level 3 = 57.170, 57.186, 57.064
stockfish: Total Time = 31665033, 31089824, 31357896
aom-av1: Speed 6 Realtime = 11.76, 11.68, 11.70
simdjson: DistinctUserID = 1.48, 1.48, 1.48
simdjson: PartialTweets = 1.37, 1.37, 1.36
basis: ETC1S = 32.705, 32.657, 32.610
basis: UASTC Level 2 = 31.892, 31.871, 31.868
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU = 18.4274, 18.4257, 18.4369
onednn: Deconvolution Batch shapes_1d - f32 - CPU = 11.7592, 11.7549, 11.7840
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU = 1.26827, 1.27125, 1.27234
aom-av1: Speed 8 Realtime = 30.06, 30.00, 29.64
incompact3d: input.i3d 129 Cells Per Direction = 16.7026520, 16.8470281, 16.8595543
onednn: IP Shapes 1D - f32 - CPU = 4.75486, 4.74640, 4.74045
onednn: IP Shapes 1D - bf16bf16bf16 - CPU = 10.5995, 10.5877, 10.5943
onednn: IP Shapes 1D - u8s8f32 - CPU = 1.05123, 1.04921, 1.0489
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU = 1.61069, 1.61866, 1.60482
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU = 0.833192, 0.834428, 0.830736
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU = 3.56301, 3.56246, 3.56446
basis: UASTC Level 0 = 10.084, 10.066, 10.072
onednn: IP Shapes 3D - f32 - CPU = 3.93294, 3.85096, 3.91852
onednn: IP Shapes 3D - bf16bf16bf16 - CPU = 3.12486, 3.12458, 3.13052
onednn: IP Shapes 3D - u8s8f32 - CPU = 1.45952, 1.42021, 1.44027
sysbench: RAM / Memory = 15699.71, 15632.40, 15678.47
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU = 16.2810, 16.2907, 16.2782
onednn: Convolution Batch Shapes Auto - f32 - CPU = 6.57468, 6.51645, 6.55537
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU = 6.39957, 6.21612, 6.16607
svt-hevc: 7 - Bosphorus 1080p = 120.65, 121.08, 120.93
svt-vp9: Visual Quality Optimized - Bosphorus 1080p = 169.62, 169.81, 170.48
svt-vp9: VMAF Optimized - Bosphorus 1080p = 216.44, 218.37, 219.08
svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p = 221.50, 221.91, 222.05
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU = 21.7683, 21.7742, 21.7774
onednn: Deconvolution Batch shapes_3d - f32 - CPU = 7.65076, 7.64752, 7.63646
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU = 1.88615, 1.88395, 1.88619
svt-hevc: 10 - Bosphorus 1080p = 258.29, 257.29, 255.87

LuaRadio

Test: Complex Phase

LuaRadio 0.9.1, Test: Complex Phase (MiB/s, more is better)
Run 3: 439.2 (SE +/- 1.11, N = 3)
Run 2: 438.2 (SE +/- 3.89, N = 3)
Run 1: 434.1 (SE +/- 3.21, N = 3)
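
Each result in this export is a mean reported with a standard error (SE) over N samples. As a minimal sketch of how such figures can be read, the Python below expresses the SE as a percentage of the mean and applies a rough check of whether two runs differ by more than their combined SE. The function names and the threshold `k = 2.0` are assumptions chosen for illustration and are not part of the Phoronix Test Suite; the input numbers are the Complex Phase values for runs 3 and 1 (439.2 with SE 1.11, and 434.1 with SE 3.21).

```python
import math


def relative_se(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


def differ_beyond_noise(mean_a, se_a, mean_b, se_b, k=2.0):
    """Rough check: is the gap between two means larger than k combined SEs?

    The combined SE is the root sum of squares of the two standard errors.
    k = 2.0 is an illustrative threshold, not a value taken from the report.
    """
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) > k * combined


# LuaRadio Complex Phase, as (mean MiB/s, SE) taken from the result above.
run3 = (439.2, 1.11)
run1 = (434.1, 3.21)

print(round(relative_se(*run1), 2))         # SE of run 1 as a percent of its mean
print(differ_beyond_noise(*run3, *run1))    # gap vs. twice the combined SE
```

With only three samples per run this is a coarse screen rather than a formal significance test.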

LuaRadio

Test: Hilbert Transform

LuaRadio 0.9.1, Test: Hilbert Transform (MiB/s, more is better)
Run 3: 55.4 (SE +/- 0.07, N = 3)
Run 2: 55.4 (SE +/- 0.06, N = 3)
Run 1: 55.4 (SE +/- 0.03, N = 3)

LuaRadio

Test: FM Deemphasis Filter

LuaRadio 0.9.1, Test: FM Deemphasis Filter (MiB/s, more is better)
Run 3: 253.3 (SE +/- 0.66, N = 3)
Run 2: 254.0 (SE +/- 0.09, N = 3)
Run 1: 254.2 (SE +/- 0.03, N = 3)

LuaRadio

Test: Five Back to Back FIR Filters

LuaRadio 0.9.1, Test: Five Back to Back FIR Filters (MiB/s, more is better)
Run 3: 810.2 (SE +/- 0.70, N = 3)
Run 2: 807.8 (SE +/- 2.49, N = 3)
Run 1: 808.6 (SE +/- 2.34, N = 3)

AOM AV1

Encoder Mode: Speed 4 Two-Pass

AOM AV1 2.1-rc, Encoder Mode: Speed 4 Two-Pass (Frames Per Second, more is better)
Run 3: 3.76 (SE +/- 0.01, N = 3)
Run 2: 3.77 (SE +/- 0.01, N = 3)
Run 1: 3.77 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

Mobile Neural Network

Model: inception-v3

Mobile Neural Network 1.1.3, Model: inception-v3 (ms, fewer is better)
Run 3: 54.09 (SE +/- 0.05, N = 3; MIN: 53.71 / MAX: 67.03)
Run 2: 54.55 (SE +/- 0.47, N = 3; MIN: 53.67 / MAX: 69.4)
Run 1: 55.10 (SE +/- 0.48, N = 3; MIN: 53.83 / MAX: 68.3)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: mobilenet-v1-1.0

Mobile Neural Network 1.1.3, Model: mobilenet-v1-1.0 (ms, fewer is better)
Run 3: 3.292 (SE +/- 0.015, N = 3; MIN: 3.15 / MAX: 3.55)
Run 2: 3.227 (SE +/- 0.027, N = 3; MIN: 2.92 / MAX: 16.04)
Run 1: 3.265 (SE +/- 0.044, N = 3; MIN: 3.07 / MAX: 3.63)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: MobileNetV2_224

Mobile Neural Network 1.1.3, Model: MobileNetV2_224 (ms, fewer is better)
Run 3: 5.099 (SE +/- 0.012, N = 3; MIN: 4.59 / MAX: 10.01)
Run 2: 5.095 (SE +/- 0.041, N = 3; MIN: 4.61 / MAX: 8.07)
Run 1: 5.038 (SE +/- 0.022, N = 3; MIN: 4.8 / MAX: 5.28)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: resnet-v2-50

Mobile Neural Network 1.1.3, Model: resnet-v2-50 (ms, fewer is better)
Run 3: 44.23 (SE +/- 0.17, N = 3; MIN: 43.21 / MAX: 56.94)
Run 2: 43.99 (SE +/- 0.20, N = 3; MIN: 43.06 / MAX: 56.17)
Run 1: 43.66 (SE +/- 0.14, N = 3; MIN: 43.28 / MAX: 56.66)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: SqueezeNetV1.0

Mobile Neural Network 1.1.3, Model: SqueezeNetV1.0 (ms, fewer is better)
Run 3: 8.236 (SE +/- 0.068, N = 3; MIN: 7.7 / MAX: 9.92)
Run 2: 8.130 (SE +/- 0.048, N = 3; MIN: 7.68 / MAX: 12.26)
Run 1: 8.133 (SE +/- 0.082, N = 3; MIN: 7.75 / MAX: 14.52)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

AOM AV1

Encoder Mode: Speed 0 Two-Pass

AOM AV1 2.1-rc, Encoder Mode: Speed 0 Two-Pass (Frames Per Second, more is better)
Run 3: 0.21 (SE +/- 0.00, N = 3)
Run 2: 0.20 (SE +/- 0.00, N = 3)
Run 1: 0.21 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

Sysbench

Test: CPU

Sysbench 1.0.20, Test: CPU (Events Per Second, more is better)
Run 3: 22943.20 (SE +/- 1.02, N = 3)
Run 2: 22942.83 (SE +/- 0.76, N = 3)
Run 1: 22943.61 (SE +/- 0.67, N = 3)
1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 3592.30 (SE +/- 5.29, N = 3; MIN: 3584.03)
Run 2: 3589.97 (SE +/- 3.00, N = 3; MIN: 3582.71)
Run 1: 3589.87 (SE +/- 1.80, N = 3; MIN: 3583.15)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
Run 3: 3588.05 (SE +/- 0.91, N = 3; MIN: 3584.71)
Run 2: 3593.32 (SE +/- 4.71, N = 3; MIN: 3583.23)
Run 1: 3589.54 (SE +/- 2.28, N = 3; MIN: 3583)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 3: 3592.55 (SE +/- 5.39, N = 3; MIN: 3582.49)
Run 2: 3588.58 (SE +/- 1.40, N = 3; MIN: 3583.51)
Run 1: 3586.70 (SE +/- 1.49, N = 3; MIN: 3583.33)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

AOM AV1

Encoder Mode: Speed 6 Two-Pass

AOM AV1 2.1-rc, Encoder Mode: Speed 6 Two-Pass (Frames Per Second, more is better)
Run 3: 9.31 (SE +/- 0.04, N = 3)
Run 2: 9.31 (SE +/- 0.03, N = 3)
Run 1: 9.29 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
Run 3: 1855.59 (SE +/- 4.07, N = 3; MIN: 1848.04)
Run 2: 1853.25 (SE +/- 1.77, N = 3; MIN: 1848.5)
Run 1: 1853.05 (SE +/- 0.80, N = 3; MIN: 1849.21)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

SVT-HEVC

Tuning: 1 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0, Tuning: 1 - Input: Bosphorus 1080p (Frames Per Second, more is better)
Run 3: 7.89 (SE +/- 0.00, N = 3)
Run 2: 7.86 (SE +/- 0.01, N = 3)
Run 1: 7.88 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 1852.34 (SE +/- 0.88, N = 3; MIN: 1848.76)
Run 2: 1853.80 (SE +/- 1.78, N = 3; MIN: 1848.59)
Run 1: 1852.81 (SE +/- 0.49, N = 3; MIN: 1850.49)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 3: 1854.28 (SE +/- 0.29, N = 3; MIN: 1851.24)
Run 2: 1856.49 (SE +/- 5.14, N = 3; MIN: 1849.03)
Run 1: 1859.12 (SE +/- 1.06, N = 3; MIN: 1850.63)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

simdjson

Throughput Test: Kostya

simdjson 0.8.2, Throughput Test: Kostya (GB/s, more is better)
Run 3: 1.13 (SE +/- 0.00, N = 3)
Run 2: 1.12 (SE +/- 0.02, N = 4)
Run 1: 1.13 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread

Xcompact3d Incompact3d

Input: input.i3d 193 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 193 Cells Per Direction (Seconds, fewer is better)
Run 3: 66.16 (SE +/- 0.37, N = 3)
Run 2: 66.67 (SE +/- 0.33, N = 3)
Run 1: 66.45 (SE +/- 0.08, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

simdjson

Throughput Test: LargeRandom

simdjson 0.8.2, Throughput Test: LargeRandom (GB/s, more is better)
Run 3: 0.38 (SE +/- 0.00, N = 3)
Run 2: 0.38 (SE +/- 0.00, N = 3)
Run 1: 0.38 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.13, Settings: UASTC Level 3 (Seconds, fewer is better)
Run 3: 57.06 (SE +/- 0.06, N = 3)
Run 2: 57.19 (SE +/- 0.02, N = 3)
Run 1: 57.17 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Stockfish

Total Time

Stockfish 13, Total Time (Nodes Per Second, more is better)
Run 3: 31357896 (SE +/- 418440.60, N = 4)
Run 2: 31089824 (SE +/- 411515.60, N = 5)
Run 1: 31665033 (SE +/- 284219.69, N = 3)
1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver

AOM AV1

Encoder Mode: Speed 6 Realtime

AOM AV1 2.1-rc, Encoder Mode: Speed 6 Realtime (Frames Per Second, more is better)
Run 3: 11.70 (SE +/- 0.05, N = 3)
Run 2: 11.68 (SE +/- 0.06, N = 3)
Run 1: 11.76 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

simdjson

Throughput Test: DistinctUserID

simdjson 0.8.2, Throughput Test: DistinctUserID (GB/s, more is better)
Run 3: 1.48 (SE +/- 0.00, N = 3)
Run 2: 1.48 (SE +/- 0.00, N = 3)
Run 1: 1.48 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: PartialTweets

simdjson 0.8.2, Throughput Test: PartialTweets (GB/s, more is better)
Run 3: 1.36 (SE +/- 0.00, N = 3)
Run 2: 1.37 (SE +/- 0.00, N = 3)
Run 1: 1.37 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread

Basis Universal

Settings: ETC1S

Basis Universal 1.13, Settings: ETC1S (Seconds, fewer is better)
Run 3: 32.61 (SE +/- 0.07, N = 3)
Run 2: 32.66 (SE +/- 0.09, N = 3)
Run 1: 32.71 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 2

Basis Universal 1.13, Settings: UASTC Level 2 (Seconds, fewer is better)
Run 3: 31.87 (SE +/- 0.02, N = 3)
Run 2: 31.87 (SE +/- 0.03, N = 3)
Run 1: 31.89 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
Run 3: 18.44 (SE +/- 0.02, N = 3; MIN: 18.28)
Run 2: 18.43 (SE +/- 0.00, N = 3; MIN: 18.28)
Run 1: 18.43 (SE +/- 0.02, N = 3; MIN: 18.28)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 3: 11.78 (SE +/- 0.05, N = 3; MIN: 11.5)
Run 2: 11.75 (SE +/- 0.02, N = 3; MIN: 11.36)
Run 1: 11.76 (SE +/- 0.02, N = 3; MIN: 11.07)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 1.27234 (SE +/- 0.00498, N = 3; MIN: 1.26)
Run 2: 1.27125 (SE +/- 0.00167, N = 3; MIN: 1.26)
Run 1: 1.26827 (SE +/- 0.00141, N = 3; MIN: 1.26)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

AOM AV1

Encoder Mode: Speed 8 Realtime

AOM AV1 2.1-rc, Encoder Mode: Speed 8 Realtime (Frames Per Second, more is better)
Run 3: 29.64 (SE +/- 0.20, N = 3)
Run 2: 30.00 (SE +/- 0.25, N = 3)
Run 1: 30.06 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

Xcompact3d Incompact3d

Input: input.i3d 129 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 129 Cells Per Direction (Seconds, fewer is better)
Run 3: 16.86 (SE +/- 0.01, N = 3)
Run 2: 16.85 (SE +/- 0.01, N = 3)
Run 1: 16.70 (SE +/- 0.03, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 3: 4.74045 (SE +/- 0.00145, N = 3; MIN: 4.64)
Run 2: 4.74640 (SE +/- 0.00637, N = 3; MIN: 4.5)
Run 1: 4.75486 (SE +/- 0.00438, N = 3; MIN: 4.62)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
Run 3: 10.59 (SE +/- 0.01, N = 3; MIN: 9.87)
Run 2: 10.59 (SE +/- 0.01, N = 3; MIN: 9.92)
Run 1: 10.60 (SE +/- 0.01, N = 3; MIN: 9.95)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 1.04890 (SE +/- 0.00161, N = 3; MIN: 1)
Run 2: 1.04921 (SE +/- 0.00059, N = 3; MIN: 1)
Run 1: 1.05123 (SE +/- 0.00328, N = 3; MIN: 1)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 3: 1.60482 (SE +/- 0.00500, N = 3; MIN: 1.56)
Run 2: 1.61866 (SE +/- 0.00101, N = 3; MIN: 1.57)
Run 1: 1.61069 (SE +/- 0.00563, N = 3; MIN: 1.57)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 0.830736 (SE +/- 0.004141, N = 3; MIN: 0.8)
Run 2: 0.834428 (SE +/- 0.000758, N = 3; MIN: 0.79)
Run 1: 0.833192 (SE +/- 0.005518, N = 3; MIN: 0.8)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
Run 3: 3.56446 (SE +/- 0.00178, N = 3; MIN: 3.52)
Run 2: 3.56246 (SE +/- 0.00246, N = 3; MIN: 3.49)
Run 1: 3.56301 (SE +/- 0.00273, N = 3; MIN: 3.51)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Basis Universal

Settings: UASTC Level 0

Basis Universal 1.13, Settings: UASTC Level 0 (Seconds, fewer is better)
Run 3: 10.07 (SE +/- 0.01, N = 3)
Run 2: 10.07 (SE +/- 0.01, N = 3)
Run 1: 10.08 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 3: 3.91852 (SE +/- 0.03733, N = 3; MIN: 3.8)
Run 2: 3.85096 (SE +/- 0.02483, N = 3; MIN: 3.76)
Run 1: 3.93294 (SE +/- 0.02492, N = 3; MIN: 3.83)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
Run 3: 3.13052 (SE +/- 0.00725, N = 3; MIN: 3.03)
Run 2: 3.12458 (SE +/- 0.00672, N = 3; MIN: 3.02)
Run 1: 3.12486 (SE +/- 0.01034, N = 3; MIN: 3.02)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 1.44027 (SE +/- 0.00257, N = 3; MIN: 1.39)
Run 2: 1.42021 (SE +/- 0.00155, N = 3; MIN: 1.37)
Run 1: 1.45952 (SE +/- 0.01077, N = 3; MIN: 1.4)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Sysbench

Test: RAM / Memory

Sysbench 1.0.20, Test: RAM / Memory (MiB/sec, more is better)
Run 3: 15678.47 (SE +/- 4.37, N = 3)
Run 2: 15632.40 (SE +/- 37.93, N = 3)
Run 1: 15699.71 (SE +/- 44.47, N = 3)
1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
Run 3: 16.28 (SE +/- 0.00, N = 3; MIN: 16.26)
Run 2: 16.29 (SE +/- 0.01, N = 3; MIN: 16.26)
Run 1: 16.28 (SE +/- 0.00, N = 3; MIN: 16.27)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 3: 6.55537 (SE +/- 0.03840, N = 3; MIN: 6.44)
Run 2: 6.51645 (SE +/- 0.00975, N = 3; MIN: 6.45)
Run 1: 6.57468 (SE +/- 0.04270, N = 3; MIN: 6.45)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 6.16607 (SE +/- 0.05945, N = 3; MIN: 6.04)
Run 2: 6.21612 (SE +/- 0.09250, N = 3; MIN: 6.06)
Run 1: 6.39957 (SE +/- 0.08087, N = 3; MIN: 6.25)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

SVT-HEVC

Tuning: 7 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0, Tuning: 7 - Input: Bosphorus 1080p (Frames Per Second, more is better)
Run 3: 120.93 (SE +/- 0.06, N = 3)
Run 2: 121.08 (SE +/- 0.10, N = 3)
Run 1: 120.65 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3, Tuning: Visual Quality Optimized - Input: Bosphorus 1080p (Frames Per Second, more is better)
Run 3: 170.48 (SE +/- 0.77, N = 3)
Run 2: 169.81 (SE +/- 0.35, N = 3)
Run 1: 169.62 (SE +/- 0.45, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3, Tuning: VMAF Optimized - Input: Bosphorus 1080p (Frames Per Second, more is better)
Run 3: 219.08 (SE +/- 1.43, N = 3)
Run 2: 218.37 (SE +/- 0.27, N = 3)
Run 1: 216.44 (SE +/- 0.35, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3, Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p (Frames Per Second, more is better)
Run 3: 222.05 (SE +/- 0.82, N = 3)
Run 2: 221.91 (SE +/- 0.07, N = 3)
Run 1: 221.50 (SE +/- 0.92, N = 3)
1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
Run 3: 21.78 (SE +/- 0.01, N = 3; MIN: 21.69)
Run 2: 21.77 (SE +/- 0.01, N = 3; MIN: 21.66)
Run 1: 21.77 (SE +/- 0.00, N = 3; MIN: 21.68)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 3: 7.63646 (SE +/- 0.00108, N = 3; MIN: 7.62)
Run 2: 7.64752 (SE +/- 0.00616, N = 3; MIN: 7.62)
Run 1: 7.65076 (SE +/- 0.01058, N = 3; MIN: 7.62)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 1.88619 (SE +/- 0.00235, N = 3; MIN: 1.88)
Run 2: 1.88395 (SE +/- 0.00013, N = 3; MIN: 1.88)
Run 1: 1.88615 (SE +/- 0.00222, N = 3; MIN: 1.88)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

SVT-HEVC

Tuning: 10 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0, Tuning: 10 - Input: Bosphorus 1080p (Frames Per Second, more is better)
Run 3: 255.87 (SE +/- 0.38, N = 3)
Run 2: 257.29 (SE +/- 0.42, N = 3)
Run 1: 258.29 (SE +/- 0.64, N = 3)
1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt


Phoronix Test Suite v10.8.5