Xeon Silver March

Intel Xeon Silver 4216 testing with a TYAN S7100AG2NR (V4.02 BIOS) and ASPEED on Debian 10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2103218-HA-XEONSILVE43&grt&sor.

Xeon Silver March: system details (runs 1, 2, 3)
Processor: Intel Xeon Silver 4216 @ 3.20GHz (16 Cores / 32 Threads)
Motherboard: TYAN S7100AG2NR (V4.02 BIOS)
Chipset: Intel Sky Lake-E DMI3 Registers
Memory: 24GB
Disk: 240GB Corsair Force MP500
Graphics: ASPEED
Audio: Realtek ALC892
Network: 2 x Intel I350
OS: Debian 10
Kernel: 4.19.0-9-amd64 (x86_64)
Desktop: GNOME Shell 3.30.2
Display Server: X Server
Compiler: GCC 8.3.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details
- Transparent Huge Pages: always

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave
- CPU Microcode: 0x500002c

Python Details
- 1: Python 2.7.16 + Python 3.7.3

Security Details
- itlb_multihit: KVM: Mitigation of Split huge pages
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Mitigation of TSX disabled

Xeon Silver March: results overview
Test | 1 | 2 | 3
aom-av1: Speed 0 Two-Pass | 0.21 | 0.20 | 0.21
aom-av1: Speed 4 Two-Pass | 3.77 | 3.77 | 3.76
aom-av1: Speed 6 Realtime | 11.76 | 11.68 | 11.70
aom-av1: Speed 6 Two-Pass | 9.29 | 9.31 | 9.31
aom-av1: Speed 8 Realtime | 30.06 | 30.00 | 29.64
basis: ETC1S | 32.705 | 32.657 | 32.610
basis: UASTC Level 0 | 10.084 | 10.066 | 10.072
basis: UASTC Level 2 | 31.892 | 31.871 | 31.868
basis: UASTC Level 3 | 57.170 | 57.186 | 57.064
luaradio: Five Back to Back FIR Filters | 808.6 | 807.8 | 810.2
luaradio: FM Deemphasis Filter | 254.2 | 254.0 | 253.3
luaradio: Hilbert Transform | 55.4 | 55.4 | 55.4
luaradio: Complex Phase | 434.1 | 438.2 | 439.2
mnn: SqueezeNetV1.0 | 8.133 | 8.130 | 8.236
mnn: resnet-v2-50 | 43.656 | 43.985 | 44.232
mnn: MobileNetV2_224 | 5.038 | 5.095 | 5.099
mnn: mobilenet-v1-1.0 | 3.265 | 3.227 | 3.292
mnn: inception-v3 | 55.095 | 54.547 | 54.090
onednn: IP Shapes 1D - f32 - CPU | 4.75486 | 4.74640 | 4.74045
onednn: IP Shapes 3D - f32 - CPU | 3.93294 | 3.85096 | 3.91852
onednn: IP Shapes 1D - u8s8f32 - CPU | 1.05123 | 1.04921 | 1.0489
onednn: IP Shapes 3D - u8s8f32 - CPU | 1.45952 | 1.42021 | 1.44027
onednn: IP Shapes 1D - bf16bf16bf16 - CPU | 10.5995 | 10.5877 | 10.5943
onednn: IP Shapes 3D - bf16bf16bf16 - CPU | 3.12486 | 3.12458 | 3.13052
onednn: Convolution Batch Shapes Auto - f32 - CPU | 6.57468 | 6.51645 | 6.55537
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 11.7592 | 11.7549 | 11.7840
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 7.65076 | 7.64752 | 7.63646
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 6.39957 | 6.21612 | 6.16607
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 1.26827 | 1.27125 | 1.27234
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 1.88615 | 1.88395 | 1.88619
onednn: Recurrent Neural Network Training - f32 - CPU | 3586.70 | 3588.58 | 3592.55
onednn: Recurrent Neural Network Inference - f32 - CPU | 1859.12 | 1856.49 | 1854.28
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 3589.87 | 3589.97 | 3592.30
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU | 16.2810 | 16.2907 | 16.2782
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU | 18.4274 | 18.4257 | 18.4369
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU | 21.7683 | 21.7742 | 21.7774
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 1852.81 | 1853.80 | 1852.34
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 1.61069 | 1.61866 | 1.60482
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 3589.54 | 3593.32 | 3588.05
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 1853.05 | 1853.25 | 1855.59
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 0.833192 | 0.834428 | 0.830736
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU | 3.56301 | 3.56246 | 3.56446
simdjson: Kostya | 1.13 | 1.12 | 1.13
simdjson: LargeRand | 0.38 | 0.38 | 0.38
simdjson: PartialTweets | 1.37 | 1.37 | 1.36
simdjson: DistinctUserID | 1.48 | 1.48 | 1.48
stockfish: Total Time | 31665033 | 31089824 | 31357896
svt-hevc: 1 - Bosphorus 1080p | 7.88 | 7.86 | 7.89
svt-hevc: 7 - Bosphorus 1080p | 120.65 | 121.08 | 120.93
svt-hevc: 10 - Bosphorus 1080p | 258.29 | 257.29 | 255.87
svt-vp9: VMAF Optimized - Bosphorus 1080p | 216.44 | 218.37 | 219.08
svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p | 221.50 | 221.91 | 222.05
svt-vp9: Visual Quality Optimized - Bosphorus 1080p | 169.62 | 169.81 | 170.48
sysbench: RAM / Memory | 15699.71 | 15632.40 | 15678.47
sysbench: CPU | 22943.61 | 22942.83 | 22943.20
incompact3d: input.i3d 129 Cells Per Direction | 16.7026520 | 16.8470281 | 16.8595543
incompact3d: input.i3d 193 Cells Per Direction | 66.4466426 | 66.6669617 | 66.1601740
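
To gauge how closely the three runs agree, the overview values can be compared directly. The following is a minimal sketch, assuming the relative spread is defined as (max - min) / min. The helper name `relative_spread_percent` is hypothetical and not part of the Phoronix Test Suite. The sample values are the oneDNN Convolution Batch Shapes Auto u8s8f32 results (ms) for runs 1, 2 and 3.

```python
def relative_spread_percent(values):
    """Return (max - min) / min as a percentage for a list of run results."""
    lowest, highest = min(values), max(values)
    return (highest - lowest) / lowest * 100.0


# oneDNN Convolution Batch Shapes Auto - u8s8f32 - CPU, runs 1, 2, 3 (ms)
conv_u8s8f32 = [6.39957, 6.21612, 6.16607]
print(f"{relative_spread_percent(conv_u8s8f32):.2f}%")
```

The same helper can be applied to any row of the overview. A spread of a few percent or less suggests the runs are effectively repeats of the same configuration.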

AOM AV1

Encoder Mode: Speed 0 Two-Pass

AOM AV1 2.1-rc (Frames Per Second, More Is Better)
Run 3: 0.21 (SE +/- 0.00, N = 3)
Run 1: 0.21 (SE +/- 0.00, N = 3)
Run 2: 0.20 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
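
Each result in this file is reported as a mean with "SE +/- x, N = n". The following is a minimal sketch of that calculation, assuming SE is the conventional standard error of the mean (sample standard deviation divided by the square root of N). Whether the Phoronix Test Suite uses exactly this estimator is an assumption, and the per-trial values below are hypothetical, since the export lists only aggregated results.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-trial frame rates; not taken from the result file.
trials = [11.65, 11.80, 11.83]
mean = statistics.mean(trials)
print(f"{mean:.2f} (SE +/- {standard_error(trials):.2f}, N = {len(trials)})")
```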

AOM AV1

Encoder Mode: Speed 4 Two-Pass

AOM AV1 2.1-rc (Frames Per Second, More Is Better)
Run 2: 3.77 (SE +/- 0.01, N = 3)
Run 1: 3.77 (SE +/- 0.00, N = 3)
Run 3: 3.76 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 6 Realtime

AOM AV1 2.1-rc (Frames Per Second, More Is Better)
Run 1: 11.76 (SE +/- 0.06, N = 3)
Run 3: 11.70 (SE +/- 0.05, N = 3)
Run 2: 11.68 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 6 Two-Pass

AOM AV1 2.1-rc (Frames Per Second, More Is Better)
Run 3: 9.31 (SE +/- 0.04, N = 3)
Run 2: 9.31 (SE +/- 0.03, N = 3)
Run 1: 9.29 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 8 Realtime

AOM AV1 2.1-rc (Frames Per Second, More Is Better)
Run 1: 30.06 (SE +/- 0.15, N = 3)
Run 2: 30.00 (SE +/- 0.25, N = 3)
Run 3: 29.64 (SE +/- 0.20, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

Basis Universal

Settings: ETC1S

Basis Universal 1.13 (Seconds, Fewer Is Better)
Run 3: 32.61 (SE +/- 0.07, N = 3)
Run 2: 32.66 (SE +/- 0.09, N = 3)
Run 1: 32.71 (SE +/- 0.02, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 0

Basis Universal 1.13 (Seconds, Fewer Is Better)
Run 2: 10.07 (SE +/- 0.01, N = 3)
Run 3: 10.07 (SE +/- 0.01, N = 3)
Run 1: 10.08 (SE +/- 0.02, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 2

Basis Universal 1.13 (Seconds, Fewer Is Better)
Run 3: 31.87 (SE +/- 0.02, N = 3)
Run 2: 31.87 (SE +/- 0.03, N = 3)
Run 1: 31.89 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.13 (Seconds, Fewer Is Better)
Run 3: 57.06 (SE +/- 0.06, N = 3)
Run 1: 57.17 (SE +/- 0.00, N = 3)
Run 2: 57.19 (SE +/- 0.02, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

LuaRadio

Test: Five Back to Back FIR Filters

LuaRadio 0.9.1 (MiB/s, More Is Better)
Run 3: 810.2 (SE +/- 0.70, N = 3)
Run 1: 808.6 (SE +/- 2.34, N = 3)
Run 2: 807.8 (SE +/- 2.49, N = 3)

LuaRadio

Test: FM Deemphasis Filter

LuaRadio 0.9.1 (MiB/s, More Is Better)
Run 1: 254.2 (SE +/- 0.03, N = 3)
Run 2: 254.0 (SE +/- 0.09, N = 3)
Run 3: 253.3 (SE +/- 0.66, N = 3)

LuaRadio

Test: Hilbert Transform

LuaRadio 0.9.1 (MiB/s, More Is Better)
Run 3: 55.4 (SE +/- 0.07, N = 3)
Run 2: 55.4 (SE +/- 0.06, N = 3)
Run 1: 55.4 (SE +/- 0.03, N = 3)

LuaRadio

Test: Complex Phase

LuaRadio 0.9.1 (MiB/s, More Is Better)
Run 3: 439.2 (SE +/- 1.11, N = 3)
Run 2: 438.2 (SE +/- 3.89, N = 3)
Run 1: 434.1 (SE +/- 3.21, N = 3)

Mobile Neural Network

Model: SqueezeNetV1.0

Mobile Neural Network 1.1.3 (ms, Fewer Is Better)
Run 2: 8.130 (SE +/- 0.048, N = 3; MIN: 7.68 / MAX: 12.26)
Run 1: 8.133 (SE +/- 0.082, N = 3; MIN: 7.75 / MAX: 14.52)
Run 3: 8.236 (SE +/- 0.068, N = 3; MIN: 7.7 / MAX: 9.92)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: resnet-v2-50

Mobile Neural Network 1.1.3 (ms, Fewer Is Better)
Run 1: 43.66 (SE +/- 0.14, N = 3; MIN: 43.28 / MAX: 56.66)
Run 2: 43.99 (SE +/- 0.20, N = 3; MIN: 43.06 / MAX: 56.17)
Run 3: 44.23 (SE +/- 0.17, N = 3; MIN: 43.21 / MAX: 56.94)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: MobileNetV2_224

Mobile Neural Network 1.1.3 (ms, Fewer Is Better)
Run 1: 5.038 (SE +/- 0.022, N = 3; MIN: 4.8 / MAX: 5.28)
Run 2: 5.095 (SE +/- 0.041, N = 3; MIN: 4.61 / MAX: 8.07)
Run 3: 5.099 (SE +/- 0.012, N = 3; MIN: 4.59 / MAX: 10.01)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: mobilenet-v1-1.0

Mobile Neural Network 1.1.3 (ms, Fewer Is Better)
Run 2: 3.227 (SE +/- 0.027, N = 3; MIN: 2.92 / MAX: 16.04)
Run 1: 3.265 (SE +/- 0.044, N = 3; MIN: 3.07 / MAX: 3.63)
Run 3: 3.292 (SE +/- 0.015, N = 3; MIN: 3.15 / MAX: 3.55)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: inception-v3

Mobile Neural Network 1.1.3 (ms, Fewer Is Better)
Run 3: 54.09 (SE +/- 0.05, N = 3; MIN: 53.71 / MAX: 67.03)
Run 2: 54.55 (SE +/- 0.47, N = 3; MIN: 53.67 / MAX: 69.4)
Run 1: 55.10 (SE +/- 0.48, N = 3; MIN: 53.83 / MAX: 68.3)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 3: 4.74045 (SE +/- 0.00145, N = 3; MIN: 4.64)
Run 2: 4.74640 (SE +/- 0.00637, N = 3; MIN: 4.5)
Run 1: 4.75486 (SE +/- 0.00438, N = 3; MIN: 4.62)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 2: 3.85096 (SE +/- 0.02483, N = 3; MIN: 3.76)
Run 3: 3.91852 (SE +/- 0.03733, N = 3; MIN: 3.8)
Run 1: 3.93294 (SE +/- 0.02492, N = 3; MIN: 3.83)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 3: 1.04890 (SE +/- 0.00161, N = 3; MIN: 1)
Run 2: 1.04921 (SE +/- 0.00059, N = 3; MIN: 1)
Run 1: 1.05123 (SE +/- 0.00328, N = 3; MIN: 1)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 2: 1.42021 (SE +/- 0.00155, N = 3; MIN: 1.37)
Run 3: 1.44027 (SE +/- 0.00257, N = 3; MIN: 1.39)
Run 1: 1.45952 (SE +/- 0.01077, N = 3; MIN: 1.4)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 2: 10.59 (SE +/- 0.01, N = 3; MIN: 9.92)
Run 3: 10.59 (SE +/- 0.01, N = 3; MIN: 9.87)
Run 1: 10.60 (SE +/- 0.01, N = 3; MIN: 9.95)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 2: 3.12458 (SE +/- 0.00672, N = 3; MIN: 3.02)
Run 1: 3.12486 (SE +/- 0.01034, N = 3; MIN: 3.02)
Run 3: 3.13052 (SE +/- 0.00725, N = 3; MIN: 3.03)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 2: 6.51645 (SE +/- 0.00975, N = 3; MIN: 6.45)
Run 3: 6.55537 (SE +/- 0.03840, N = 3; MIN: 6.44)
Run 1: 6.57468 (SE +/- 0.04270, N = 3; MIN: 6.45)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 2: 11.75 (SE +/- 0.02, N = 3; MIN: 11.36)
Run 1: 11.76 (SE +/- 0.02, N = 3; MIN: 11.07)
Run 3: 11.78 (SE +/- 0.05, N = 3; MIN: 11.5)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 3: 7.63646 (SE +/- 0.00108, N = 3; MIN: 7.62)
Run 2: 7.64752 (SE +/- 0.00616, N = 3; MIN: 7.62)
Run 1: 7.65076 (SE +/- 0.01058, N = 3; MIN: 7.62)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 3: 6.16607 (SE +/- 0.05945, N = 3; MIN: 6.04)
Run 2: 6.21612 (SE +/- 0.09250, N = 3; MIN: 6.06)
Run 1: 6.39957 (SE +/- 0.08087, N = 3; MIN: 6.25)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 1: 1.26827 (SE +/- 0.00141, N = 3; MIN: 1.26)
Run 2: 1.27125 (SE +/- 0.00167, N = 3; MIN: 1.26)
Run 3: 1.27234 (SE +/- 0.00498, N = 3; MIN: 1.26)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 2: 1.88395 (SE +/- 0.00013, N = 3; MIN: 1.88)
Run 1: 1.88615 (SE +/- 0.00222, N = 3; MIN: 1.88)
Run 3: 1.88619 (SE +/- 0.00235, N = 3; MIN: 1.88)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 1: 3586.70 (SE +/- 1.49, N = 3; MIN: 3583.33)
Run 2: 3588.58 (SE +/- 1.40, N = 3; MIN: 3583.51)
Run 3: 3592.55 (SE +/- 5.39, N = 3; MIN: 3582.49)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 3: 1854.28 (SE +/- 0.29, N = 3; MIN: 1851.24)
Run 2: 1856.49 (SE +/- 5.14, N = 3; MIN: 1849.03)
Run 1: 1859.12 (SE +/- 1.06, N = 3; MIN: 1850.63)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 1: 3589.87 (SE +/- 1.80, N = 3; MIN: 3583.15)
Run 2: 3589.97 (SE +/- 3.00, N = 3; MIN: 3582.71)
Run 3: 3592.30 (SE +/- 5.29, N = 3; MIN: 3584.03)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 3: 16.28 (SE +/- 0.00, N = 3; MIN: 16.26)
Run 1: 16.28 (SE +/- 0.00, N = 3; MIN: 16.27)
Run 2: 16.29 (SE +/- 0.01, N = 3; MIN: 16.26)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 2: 18.43 (SE +/- 0.00, N = 3; MIN: 18.28)
Run 1: 18.43 (SE +/- 0.02, N = 3; MIN: 18.28)
Run 3: 18.44 (SE +/- 0.02, N = 3; MIN: 18.28)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 1: 21.77 (SE +/- 0.00, N = 3; MIN: 21.68)
Run 2: 21.77 (SE +/- 0.01, N = 3; MIN: 21.66)
Run 3: 21.78 (SE +/- 0.01, N = 3; MIN: 21.69)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 3: 1852.34 (SE +/- 0.88, N = 3; MIN: 1848.76)
Run 1: 1852.81 (SE +/- 0.49, N = 3; MIN: 1850.49)
Run 2: 1853.80 (SE +/- 1.78, N = 3; MIN: 1848.59)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 3: 1.60482 (SE +/- 0.00500, N = 3; MIN: 1.56)
Run 1: 1.61069 (SE +/- 0.00563, N = 3; MIN: 1.57)
Run 2: 1.61866 (SE +/- 0.00101, N = 3; MIN: 1.57)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 3: 3588.05 (SE +/- 0.91, N = 3; MIN: 3584.71)
Run 1: 3589.54 (SE +/- 2.28, N = 3; MIN: 3583)
Run 2: 3593.32 (SE +/- 4.71, N = 3; MIN: 3583.23)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 1: 1853.05 (SE +/- 0.80, N = 3; MIN: 1849.21)
Run 2: 1853.25 (SE +/- 1.77, N = 3; MIN: 1848.5)
Run 3: 1855.59 (SE +/- 4.07, N = 3; MIN: 1848.04)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 3: 0.830736 (SE +/- 0.004141, N = 3; MIN: 0.8)
Run 1: 0.833192 (SE +/- 0.005518, N = 3; MIN: 0.8)
Run 2: 0.834428 (SE +/- 0.000758, N = 3; MIN: 0.79)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Run 2: 3.56246 (SE +/- 0.00246, N = 3; MIN: 3.49)
Run 1: 3.56301 (SE +/- 0.00273, N = 3; MIN: 3.51)
Run 3: 3.56446 (SE +/- 0.00178, N = 3; MIN: 3.52)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

simdjson

Throughput Test: Kostya

simdjson 0.8.2 (GB/s, More Is Better)
Run 3: 1.13 (SE +/- 0.00, N = 3)
Run 1: 1.13 (SE +/- 0.00, N = 3)
Run 2: 1.12 (SE +/- 0.02, N = 4)
(CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: LargeRandom

simdjson 0.8.2 (GB/s, More Is Better)
Run 3: 0.38 (SE +/- 0.00, N = 3)
Run 2: 0.38 (SE +/- 0.00, N = 3)
Run 1: 0.38 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: PartialTweets

simdjson 0.8.2 (GB/s, More Is Better)
Run 2: 1.37 (SE +/- 0.00, N = 3)
Run 1: 1.37 (SE +/- 0.00, N = 3)
Run 3: 1.36 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: DistinctUserID

simdjson 0.8.2 (GB/s, More Is Better)
Run 3: 1.48 (SE +/- 0.00, N = 3)
Run 2: 1.48 (SE +/- 0.00, N = 3)
Run 1: 1.48 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -pthread

Stockfish

Total Time

Stockfish 13 (Nodes Per Second, More Is Better)
Run 1: 31665033 (SE +/- 284219.69, N = 3)
Run 3: 31357896 (SE +/- 418440.60, N = 4)
Run 2: 31089824 (SE +/- 411515.60, N = 5)
(CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver
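
The Stockfish results carry larger standard errors and differing sample counts (N = 3, 4 and 5). Expressing SE as a percentage of the mean makes such results easier to compare across tests with different units. The following is a minimal sketch; `relative_se_percent` is a hypothetical helper, and the numbers are the Stockfish values reported above.

```python
def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return se / mean * 100.0


# (mean nodes per second, SE, N) for runs 1, 3 and 2 as reported above
stockfish = {
    1: (31665033, 284219.69, 3),
    3: (31357896, 418440.60, 4),
    2: (31089824, 411515.60, 5),
}
for run, (mean, se, n) in stockfish.items():
    print(f"Run {run}: {relative_se_percent(mean, se):.2f}% of mean (N = {n})")
```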

SVT-HEVC

Tuning: 1 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
Run 3: 7.89 (SE +/- 0.00, N = 3)
Run 1: 7.88 (SE +/- 0.01, N = 3)
Run 2: 7.86 (SE +/- 0.01, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 7 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
Run 2: 121.08 (SE +/- 0.10, N = 3)
Run 3: 120.93 (SE +/- 0.06, N = 3)
Run 1: 120.65 (SE +/- 0.03, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 10 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
Run 1: 258.29 (SE +/- 0.64, N = 3)
Run 2: 257.29 (SE +/- 0.42, N = 3)
Run 3: 255.87 (SE +/- 0.38, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3 (Frames Per Second, More Is Better)
Run 3: 219.08 (SE +/- 1.43, N = 3)
Run 2: 218.37 (SE +/- 0.27, N = 3)
Run 1: 216.44 (SE +/- 0.35, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3 (Frames Per Second, More Is Better)
Run 3: 222.05 (SE +/- 0.82, N = 3)
Run 2: 221.91 (SE +/- 0.07, N = 3)
Run 1: 221.50 (SE +/- 0.92, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3 (Frames Per Second, More Is Better)
Run 3: 170.48 (SE +/- 0.77, N = 3)
Run 2: 169.81 (SE +/- 0.35, N = 3)
Run 1: 169.62 (SE +/- 0.45, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

Sysbench

Test: RAM / Memory

Sysbench 1.0.20 (MiB/sec, More Is Better)
Run 1: 15699.71 (SE +/- 44.47, N = 3)
Run 3: 15678.47 (SE +/- 4.37, N = 3)
Run 2: 15632.40 (SE +/- 37.93, N = 3)
(CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

Sysbench

Test: CPU

Sysbench 1.0.20 (Events Per Second, More Is Better)
Run 1: 22943.61 (SE +/- 0.67, N = 3)
Run 3: 22943.20 (SE +/- 1.02, N = 3)
Run 2: 22942.83 (SE +/- 0.76, N = 3)
(CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

Xcompact3d Incompact3d

Input: input.i3d 129 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
Run 1: 16.70 (SE +/- 0.03, N = 3)
Run 2: 16.85 (SE +/- 0.01, N = 3)
Run 3: 16.86 (SE +/- 0.01, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Xcompact3d Incompact3d

Input: input.i3d 193 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
Run 3: 66.16 (SE +/- 0.37, N = 3)
Run 1: 66.45 (SE +/- 0.08, N = 3)
Run 2: 66.67 (SE +/- 0.33, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi


Phoronix Test Suite v10.8.4