2990WX March

AMD Ryzen Threadripper 2990WX 32-Core testing with an ASUS ROG ZENITH EXTREME (1701 BIOS) and Gigabyte AMD Radeon RX 470/480/570/570X/580/580X/590 4GB on Ubuntu 20.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2103152-HA-2990WXMAR69&grt&sor.

2990WX March (results 1, 2, 3)

Processor: AMD Ryzen Threadripper 2990WX 32-Core @ 3.00GHz (32 Cores / 64 Threads)
Motherboard: ASUS ROG ZENITH EXTREME (1701 BIOS)
Chipset: AMD 17h
Memory: 32GB
Disk: Samsung SSD 970 EVO 500GB + 250GB Western Digital WDS250G2X0C-00L350
Graphics: Gigabyte AMD Radeon RX 470/480/570/570X/580/580X/590 4GB (1244/1750MHz)
Audio: Realtek ALC1220
Monitor: LG Ultra HD
Network: Intel I211 + Qualcomm Atheros QCA6174 802.11ac + Wilocity Wil6200 802.11ad
OS: Ubuntu 20.10
Kernel: 5.8.0-44-generic (x86_64)
Desktop: GNOME Shell 3.38.1
Display Server: X Server 1.20.9
OpenGL: 4.6 Mesa 20.2.1 (LLVM 11.0.0)
Vulkan: 1.2.131
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x800820d

Python Details: Python 3.8.6

Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected

2990WX March: result overview

Test | 1 | 2 | 3
aom-av1: Speed 0 Two-Pass | 0.21 | 0.21 | 0.21
aom-av1: Speed 4 Two-Pass | 4.51 | 4.47 | 4.50
aom-av1: Speed 6 Realtime | 17.24 | 17.15 | 17.19
aom-av1: Speed 6 Two-Pass | 13.33 | 13.38 | 13.33
aom-av1: Speed 8 Realtime | 52.03 | 53.32 | 53.85
astcenc: Medium | 5.2065 | 5.1920 | 5.2163
astcenc: Thorough | 6.3944 | 6.3871 | 6.4015
astcenc: Exhaustive | 45.6420 | 46.0109 | 45.7101
basis: ETC1S | 27.434 | 27.108 | 27.516
basis: UASTC Level 0 | 7.493 | 7.477 | 7.495
basis: UASTC Level 2 | 15.671 | 15.723 | 15.792
basis: UASTC Level 3 | 25.058 | 25.163 | 25.160
gnuradio: Five Back to Back FIR Filters | 369.9 | 384.1 | 373.6
gnuradio: Signal Source (Cosine) | 3139.8 | 3176.9 | 3144.5
gnuradio: FIR Filter | 601.9 | 599.5 | 607.0
gnuradio: IIR Filter | 575.8 | 584.0 | 580.3
gnuradio: FM Deemphasis Filter | 809.1 | 804.0 | 791.9
gnuradio: Hilbert Transform | 411.9 | 410.3 | 412.4
jpegxl: PNG - 5 | 58.84 | 58.35 | 58.67
jpegxl: PNG - 7 | 8.26 | 8.26 | 8.25
jpegxl: PNG - 8 | 0.72 | 0.71 | 0.72
jpegxl: JPEG - 5 | 53.26 | 53.60 | 52.85
jpegxl: JPEG - 7 | 53.47 | 54.08 | 53.19
jpegxl: JPEG - 8 | 22.91 | 23.49 | 23.30
jpegxl-decode: 1 | 32.45 | 32.59 | 32.55
jpegxl-decode: All | 173.91 | 174.35 | 174.95
liquid-dsp: 1 - 256 - 57 | 60627667 | 59917000 | 60272667
liquid-dsp: 2 - 256 - 57 | 120953333 | 119076667 | 120150000
liquid-dsp: 4 - 256 - 57 | 236713333 | 236780000 | 237340000
liquid-dsp: 8 - 256 - 57 | 460310000 | 460586667 | 460676667
liquid-dsp: 16 - 256 - 57 | 835613333 | 835380000 | 836323333
liquid-dsp: 32 - 256 - 57 | 1477766667 | 1474633333 | 1478866667
liquid-dsp: 64 - 256 - 57 | 1616966667 | 1616600000 | 1614700000
luaradio: Five Back to Back FIR Filters | 493.1 | 488.1 | 491.9
luaradio: FM Deemphasis Filter | 405.0 | 403.2 | 404.7
luaradio: Hilbert Transform | 98.1 | 97.4 | 97.4
luaradio: Complex Phase | 570.6 | 565.6 | 567.3
mnn: SqueezeNetV1.0 | 9.062 | 8.783 | 8.912
mnn: resnet-v2-50 | 38.015 | 38.151 | 37.561
mnn: MobileNetV2_224 | 5.871 | 5.910 | 5.718
mnn: mobilenet-v1-1.0 | 4.310 | 4.234 | 4.041
mnn: inception-v3 | 48.048 | 48.082 | 47.210
onednn: IP Shapes 1D - f32 - CPU | 6.68194 | 6.68145 | 6.72476
onednn: IP Shapes 3D - f32 - CPU | 12.4504 | 12.0962 | 12.5269
onednn: IP Shapes 1D - u8s8f32 - CPU | 2.51177 | 2.52878 | 2.53747
onednn: IP Shapes 3D - u8s8f32 - CPU | 3.57521 | 3.59224 | 3.60454
onednn: Convolution Batch Shapes Auto - f32 - CPU | 20.4034 | 20.3116 | 20.4069
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 6.96031 | 6.96671 | 6.98731
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 5.93567 | 5.92992 | 5.93542
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 25.4645 | 25.3540 | 25.4172
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 2.63122 | 2.62957 | 2.62773
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 3.80016 | 3.79185 | 3.78904
onednn: Recurrent Neural Network Training - f32 - CPU | 13898.8 | 13707.4 | 13632.5
onednn: Recurrent Neural Network Inference - f32 - CPU | 3886.86 | 3917.80 | 3817.64
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 13965.6 | 13883.6 | 13706.6
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 3918.54 | 3897.24 | 3912.25
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 1.91617 | 1.89112 | 1.76706
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 14208.1 | 14041.2 | 13829.2
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 3947.67 | 3911.83 | 3928.48
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 1.55988 | 1.55683 | 1.56511
simdjson: Kostya | 2.28 | 2.29 | 2.29
simdjson: LargeRand | 0.92 | 0.92 | 0.92
simdjson: PartialTweets | 2.96 | 2.97 | 2.97
simdjson: DistinctUserID | 3.32 | 3.33 | 3.33
srslte: OFDM_Test | 82300000 | 82266667 | 82733333
srslte: PHY_DL_Test | 227.9 | 228.7 | 229.1
srslte: PHY_DL_Test | 86.2 | 86.2 | 86.5
sysbench: RAM / Memory | 6791.12 | 6809.31 | 6733.70
sysbench: CPU | 57045.06 | 57054.82 | 57053.00

AOM AV1

Encoder Mode: Speed 0 Two-Pass

AOM AV1 2.1-rc, Frames Per Second, More Is Better
Run 3: 0.21 (SE +/- 0.00, N = 3)
Run 2: 0.21 (SE +/- 0.00, N = 3)
Run 1: 0.21 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 4 Two-Pass

AOM AV1 2.1-rc, Frames Per Second, More Is Better
Run 1: 4.51 (SE +/- 0.02, N = 3)
Run 3: 4.50 (SE +/- 0.02, N = 3)
Run 2: 4.47 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 6 Realtime

AOM AV1 2.1-rc, Frames Per Second, More Is Better
Run 1: 17.24 (SE +/- 0.02, N = 3)
Run 3: 17.19 (SE +/- 0.09, N = 3)
Run 2: 17.15 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 6 Two-Pass

AOM AV1 2.1-rc, Frames Per Second, More Is Better
Run 2: 13.38 (SE +/- 0.12, N = 3)
Run 3: 13.33 (SE +/- 0.08, N = 3)
Run 1: 13.33 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 8 Realtime

AOM AV1 2.1-rc, Frames Per Second, More Is Better
Run 3: 53.85 (SE +/- 0.00, N = 3)
Run 2: 53.32 (SE +/- 0.82, N = 3)
Run 1: 52.03 (SE +/- 0.83, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

ASTC Encoder

Preset: Medium

ASTC Encoder 2.4, Seconds, Fewer Is Better
Run 2: 5.1920 (SE +/- 0.0034, N = 3)
Run 1: 5.2065 (SE +/- 0.0362, N = 3)
Run 3: 5.2163 (SE +/- 0.0222, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Thorough

ASTC Encoder 2.4, Seconds, Fewer Is Better
Run 2: 6.3871 (SE +/- 0.0067, N = 3)
Run 1: 6.3944 (SE +/- 0.0026, N = 3)
Run 3: 6.4015 (SE +/- 0.0153, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 2.4, Seconds, Fewer Is Better
Run 1: 45.64 (SE +/- 0.05, N = 3)
Run 3: 45.71 (SE +/- 0.06, N = 3)
Run 2: 46.01 (SE +/- 0.30, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

Basis Universal

Settings: ETC1S

Basis Universal 1.13, Seconds, Fewer Is Better
Run 2: 27.11 (SE +/- 0.16, N = 3)
Run 1: 27.43 (SE +/- 0.27, N = 3)
Run 3: 27.52 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 0

Basis Universal 1.13, Seconds, Fewer Is Better
Run 2: 7.477 (SE +/- 0.011, N = 3)
Run 1: 7.493 (SE +/- 0.013, N = 3)
Run 3: 7.495 (SE +/- 0.038, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 2

Basis Universal 1.13, Seconds, Fewer Is Better
Run 1: 15.67 (SE +/- 0.04, N = 3)
Run 2: 15.72 (SE +/- 0.03, N = 3)
Run 3: 15.79 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.13, Seconds, Fewer Is Better
Run 1: 25.06 (SE +/- 0.06, N = 3)
Run 3: 25.16 (SE +/- 0.04, N = 3)
Run 2: 25.16 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

GNU Radio

Test: Five Back to Back FIR Filters

GNU Radio, MiB/s, More Is Better
Run 2: 384.1 (SE +/- 3.80, N = 3)
Run 3: 373.6 (SE +/- 4.94, N = 9)
Run 1: 369.9 (SE +/- 5.54, N = 9)
1. 3.8.1.0

GNU Radio

Test: Signal Source (Cosine)

GNU Radio, MiB/s, More Is Better
Run 2: 3176.9 (SE +/- 18.55, N = 3)
Run 3: 3144.5 (SE +/- 12.27, N = 9)
Run 1: 3139.8 (SE +/- 11.69, N = 9)
1. 3.8.1.0

GNU Radio

Test: FIR Filter

GNU Radio, MiB/s, More Is Better
Run 3: 607.0 (SE +/- 1.71, N = 9)
Run 1: 601.9 (SE +/- 4.29, N = 9)
Run 2: 599.5 (SE +/- 2.03, N = 3)
1. 3.8.1.0

GNU Radio

Test: IIR Filter

GNU Radio, MiB/s, More Is Better
Run 2: 584.0 (SE +/- 1.47, N = 3)
Run 3: 580.3 (SE +/- 1.18, N = 9)
Run 1: 575.8 (SE +/- 3.61, N = 9)
1. 3.8.1.0

GNU Radio

Test: FM Deemphasis Filter

GNU Radio, MiB/s, More Is Better
Run 1: 809.1 (SE +/- 2.77, N = 9)
Run 2: 804.0 (SE +/- 2.87, N = 3)
Run 3: 791.9 (SE +/- 11.33, N = 9)
1. 3.8.1.0

GNU Radio

Test: Hilbert Transform

GNU Radio, MiB/s, More Is Better
Run 3: 412.4 (SE +/- 1.42, N = 9)
Run 1: 411.9 (SE +/- 2.03, N = 9)
Run 2: 410.3 (SE +/- 1.57, N = 3)
1. 3.8.1.0

JPEG XL

Input: PNG - Encode Speed: 5

JPEG XL 0.3.3, MP/s, More Is Better
Run 1: 58.84 (SE +/- 0.09, N = 3)
Run 3: 58.67 (SE +/- 0.89, N = 3)
Run 2: 58.35 (SE +/- 0.38, N = 3)
1. (CXX) g++ options: -funwind-tables -O3 -O2 -pthread -fPIE -pie -ldl

JPEG XL

Input: PNG - Encode Speed: 7

JPEG XL 0.3.3, MP/s, More Is Better
Run 2: 8.26 (SE +/- 0.05, N = 3)
Run 1: 8.26 (SE +/- 0.02, N = 3)
Run 3: 8.25 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -funwind-tables -O3 -O2 -pthread -fPIE -pie -ldl

JPEG XL

Input: PNG - Encode Speed: 8

JPEG XL 0.3.3, MP/s, More Is Better
Run 3: 0.72 (SE +/- 0.00, N = 3)
Run 1: 0.72 (SE +/- 0.00, N = 3)
Run 2: 0.71 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -funwind-tables -O3 -O2 -pthread -fPIE -pie -ldl

JPEG XL

Input: JPEG - Encode Speed: 5

JPEG XL 0.3.3, MP/s, More Is Better
Run 2: 53.60 (SE +/- 0.61, N = 3)
Run 1: 53.26 (SE +/- 0.28, N = 3)
Run 3: 52.85 (SE +/- 0.76, N = 3)
1. (CXX) g++ options: -funwind-tables -O3 -O2 -pthread -fPIE -pie -ldl

JPEG XL

Input: JPEG - Encode Speed: 7

JPEG XL 0.3.3, MP/s, More Is Better
Run 2: 54.08 (SE +/- 0.05, N = 3)
Run 1: 53.47 (SE +/- 0.06, N = 3)
Run 3: 53.19 (SE +/- 0.40, N = 3)
1. (CXX) g++ options: -funwind-tables -O3 -O2 -pthread -fPIE -pie -ldl

JPEG XL

Input: JPEG - Encode Speed: 8

JPEG XL 0.3.3, MP/s, More Is Better
Run 2: 23.49 (SE +/- 0.15, N = 3)
Run 3: 23.30 (SE +/- 0.15, N = 3)
Run 1: 22.91 (SE +/- 0.16, N = 3)
1. (CXX) g++ options: -funwind-tables -O3 -O2 -pthread -fPIE -pie -ldl

JPEG XL Decoding

CPU Threads: 1

JPEG XL Decoding 0.3.3, MP/s, More Is Better
Run 2: 32.59 (SE +/- 0.07, N = 3)
Run 3: 32.55 (SE +/- 0.04, N = 3)
Run 1: 32.45 (SE +/- 0.01, N = 3)

JPEG XL Decoding

CPU Threads: All

JPEG XL Decoding 0.3.3, MP/s, More Is Better
Run 3: 174.95 (SE +/- 0.29, N = 3)
Run 2: 174.35 (SE +/- 0.60, N = 3)
Run 1: 173.91 (SE +/- 0.36, N = 3)

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31, samples/s, More Is Better
Run 1: 60627667 (SE +/- 6385.75, N = 3)
Run 3: 60272667 (SE +/- 229485.17, N = 3)
Run 2: 59917000 (SE +/- 65592.17, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 2 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31, samples/s, More Is Better
Run 1: 120953333 (SE +/- 8819.17, N = 3)
Run 3: 120150000 (SE +/- 165025.25, N = 3)
Run 2: 119076667 (SE +/- 524859.77, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 4 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31, samples/s, More Is Better
Run 3: 237340000 (SE +/- 585946.53, N = 3)
Run 2: 236780000 (SE +/- 528488.41, N = 3)
Run 1: 236713333 (SE +/- 161279.61, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 8 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31, samples/s, More Is Better
Run 3: 460676667 (SE +/- 870868.02, N = 3)
Run 2: 460586667 (SE +/- 668539.04, N = 3)
Run 1: 460310000 (SE +/- 920887.25, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31, samples/s, More Is Better
Run 3: 836323333 (SE +/- 670232.13, N = 3)
Run 1: 835613333 (SE +/- 414782.41, N = 3)
Run 2: 835380000 (SE +/- 1629918.20, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31, samples/s, More Is Better
Run 3: 1478866667 (SE +/- 4056818.68, N = 3)
Run 1: 1477766667 (SE +/- 2370185.18, N = 3)
Run 2: 1474633333 (SE +/- 5967783.88, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31, samples/s, More Is Better
Run 1: 1616966667 (SE +/- 1354416.64, N = 3)
Run 2: 1616600000 (SE +/- 1026320.29, N = 3)
Run 3: 1614700000 (SE +/- 1193035.34, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

LuaRadio

Test: Five Back to Back FIR Filters

LuaRadio 0.9.1, MiB/s, More Is Better
Run 1: 493.1 (SE +/- 1.15, N = 3)
Run 3: 491.9 (SE +/- 1.14, N = 3)
Run 2: 488.1 (SE +/- 0.58, N = 3)

LuaRadio

Test: FM Deemphasis Filter

LuaRadio 0.9.1, MiB/s, More Is Better
Run 1: 405.0 (SE +/- 0.51, N = 3)
Run 3: 404.7 (SE +/- 0.90, N = 3)
Run 2: 403.2 (SE +/- 2.02, N = 3)

LuaRadio

Test: Hilbert Transform

LuaRadio 0.9.1, MiB/s, More Is Better
Run 1: 98.1 (SE +/- 0.17, N = 3)
Run 3: 97.4 (SE +/- 0.45, N = 3)
Run 2: 97.4 (SE +/- 0.52, N = 3)

LuaRadio

Test: Complex Phase

LuaRadio 0.9.1, MiB/s, More Is Better
Run 1: 570.6 (SE +/- 0.23, N = 3)
Run 3: 567.3 (SE +/- 3.02, N = 3)
Run 2: 565.6 (SE +/- 2.48, N = 3)

Mobile Neural Network

Model: SqueezeNetV1.0

Mobile Neural Network 1.1.3, ms, Fewer Is Better
Run 2: 8.783 (SE +/- 0.105, N = 15; MIN: 8 / MAX: 22.86)
Run 3: 8.912 (SE +/- 0.108, N = 3; MIN: 8.28 / MAX: 17.91)
Run 1: 9.062 (SE +/- 0.095, N = 15; MIN: 8.06 / MAX: 18.98)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: resnet-v2-50

Mobile Neural Network 1.1.3, ms, Fewer Is Better
Run 3: 37.56 (SE +/- 0.53, N = 3; MIN: 35.2 / MAX: 119.58)
Run 1: 38.02 (SE +/- 0.30, N = 15; MIN: 35.17 / MAX: 122.04)
Run 2: 38.15 (SE +/- 0.34, N = 15; MIN: 35.3 / MAX: 124.03)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: MobileNetV2_224

Mobile Neural Network 1.1.3, ms, Fewer Is Better
Run 3: 5.718 (SE +/- 0.211, N = 3; MIN: 5.38 / MAX: 6.46)
Run 1: 5.871 (SE +/- 0.069, N = 15; MIN: 5.4 / MAX: 14.78)
Run 2: 5.910 (SE +/- 0.058, N = 15; MIN: 5.42 / MAX: 7.12)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: mobilenet-v1-1.0

Mobile Neural Network 1.1.3, ms, Fewer Is Better
Run 3: 4.041 (SE +/- 0.114, N = 3; MIN: 3.38 / MAX: 29.44)
Run 2: 4.234 (SE +/- 0.056, N = 15; MIN: 3.39 / MAX: 44.28)
Run 1: 4.310 (SE +/- 0.063, N = 15; MIN: 3.38 / MAX: 40.78)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: inception-v3

Mobile Neural Network 1.1.3, ms, Fewer Is Better
Run 3: 47.21 (SE +/- 0.37, N = 3; MIN: 43.85 / MAX: 97.15)
Run 1: 48.05 (SE +/- 0.34, N = 15; MIN: 44.63 / MAX: 136.06)
Run 2: 48.08 (SE +/- 0.26, N = 15; MIN: 45.19 / MAX: 105.01)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 2: 6.68145 (SE +/- 0.02263, N = 3; MIN: 5.95)
Run 1: 6.68194 (SE +/- 0.09926, N = 3; MIN: 5.95)
Run 3: 6.72476 (SE +/- 0.01807, N = 3; MIN: 5.9)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 2: 12.10 (SE +/- 0.16, N = 4; MIN: 11.23)
Run 1: 12.45 (SE +/- 0.13, N = 15; MIN: 11.43)
Run 3: 12.53 (SE +/- 0.14, N = 15; MIN: 11.33)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 1: 2.51177 (SE +/- 0.00876, N = 3; MIN: 2.32)
Run 2: 2.52878 (SE +/- 0.02431, N = 3; MIN: 2.33)
Run 3: 2.53747 (SE +/- 0.02758, N = 3; MIN: 2.32)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 1: 3.57521 (SE +/- 0.00442, N = 3; MIN: 3.17)
Run 2: 3.59224 (SE +/- 0.00737, N = 3; MIN: 3.18)
Run 3: 3.60454 (SE +/- 0.01752, N = 3; MIN: 1.97)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 2: 20.31 (SE +/- 0.13, N = 3; MIN: 18.74)
Run 1: 20.40 (SE +/- 0.01, N = 3; MIN: 19.97)
Run 3: 20.41 (SE +/- 0.01, N = 3; MIN: 19.86)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 1: 6.96031 (SE +/- 0.01479, N = 3; MIN: 6.16)
Run 2: 6.96671 (SE +/- 0.01848, N = 3; MIN: 6.12)
Run 3: 6.98731 (SE +/- 0.02797, N = 3; MIN: 6.09)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 2: 5.92992 (SE +/- 0.00690, N = 3; MIN: 5.5)
Run 3: 5.93542 (SE +/- 0.00102, N = 3; MIN: 5.49)
Run 1: 5.93567 (SE +/- 0.01143, N = 3; MIN: 5.52)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 2: 25.35 (SE +/- 0.02, N = 3; MIN: 24.3)
Run 3: 25.42 (SE +/- 0.06, N = 3; MIN: 24.2)
Run 1: 25.46 (SE +/- 0.03, N = 3; MIN: 24.65)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 3: 2.62773 (SE +/- 0.00732, N = 3; MIN: 2.51)
Run 2: 2.62957 (SE +/- 0.00861, N = 3; MIN: 2.5)
Run 1: 2.63122 (SE +/- 0.00877, N = 3; MIN: 2.5)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 3: 3.78904 (SE +/- 0.00762, N = 3; MIN: 3.71)
Run 2: 3.79185 (SE +/- 0.00167, N = 3; MIN: 3.71)
Run 1: 3.80016 (SE +/- 0.00430, N = 3; MIN: 3.72)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 3: 13632.5 (SE +/- 176.85, N = 5; MIN: 12827.2)
Run 2: 13707.4 (SE +/- 145.74, N = 7; MIN: 12922.4)
Run 1: 13898.8 (SE +/- 154.68, N = 7; MIN: 13032.7)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 3: 3817.64 (SE +/- 39.34, N = 3; MIN: 3695.79)
Run 1: 3886.86 (SE +/- 17.43, N = 3; MIN: 3655.72)
Run 2: 3917.80 (SE +/- 27.48, N = 3; MIN: 3769.84)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 3: 13706.6 (SE +/- 149.77, N = 12; MIN: 12260.3)
Run 2: 13883.6 (SE +/- 182.34, N = 5; MIN: 12751.7)
Run 1: 13965.6 (SE +/- 178.47, N = 5; MIN: 13115.7)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 2: 3897.24 (SE +/- 11.92, N = 3; MIN: 3835.75)
Run 3: 3912.25 (SE +/- 15.33, N = 3; MIN: 3811.14)
Run 1: 3918.54 (SE +/- 20.29, N = 3; MIN: 3855)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 3: 1.76706 (SE +/- 0.06494, N = 15; MIN: 1.22)
Run 2: 1.89112 (SE +/- 0.06076, N = 15; MIN: 1.25)
Run 1: 1.91617 (SE +/- 0.05033, N = 15; MIN: 1.23)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 3: 13829.2 (SE +/- 92.53, N = 3; MIN: 13621)
Run 2: 14041.2 (SE +/- 125.70, N = 3; MIN: 13720.3)
Run 1: 14208.1 (SE +/- 92.65, N = 3; MIN: 13948.2)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 2: 3911.83 (SE +/- 8.79, N = 3; MIN: 3788.89)
Run 3: 3928.48 (SE +/- 15.75, N = 3; MIN: 3822.06)
Run 1: 3947.67 (SE +/- 27.19, N = 3; MIN: 3865.39)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, ms, Fewer Is Better
Run 2: 1.55683 (SE +/- 0.00329, N = 3; MIN: 1.43)
Run 1: 1.55988 (SE +/- 0.00396, N = 3; MIN: 1.43)
Run 3: 1.56511 (SE +/- 0.00128, N = 3; MIN: 1.44)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

simdjson

Throughput Test: Kostya

simdjson 0.8.2, GB/s, More Is Better
Run 3: 2.29 (SE +/- 0.00, N = 3)
Run 2: 2.29 (SE +/- 0.00, N = 3)
Run 1: 2.28 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: LargeRandom

simdjson 0.8.2, GB/s, More Is Better
Run 3: 0.92 (SE +/- 0.00, N = 3)
Run 2: 0.92 (SE +/- 0.00, N = 3)
Run 1: 0.92 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: PartialTweets

simdjson 0.8.2, GB/s, More Is Better
Run 3: 2.97 (SE +/- 0.00, N = 3)
Run 2: 2.97 (SE +/- 0.00, N = 3)
Run 1: 2.96 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread

simdjson

Throughput Test: DistinctUserID

simdjson 0.8.2, GB/s, More Is Better
Run 3: 3.33 (SE +/- 0.00, N = 3)
Run 2: 3.33 (SE +/- 0.00, N = 3)
Run 1: 3.32 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread

srsLTE

Test: OFDM_Test

srsLTE 20.10.1, Samples / Second, More Is Better
Run 3: 82733333 (SE +/- 133333.33, N = 3)
Run 1: 82300000 (SE +/- 57735.03, N = 3)
Run 2: 82266667 (SE +/- 166666.67, N = 3)
1. (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f

srsLTE

Test: PHY_DL_Test

srsLTE 20.10.1, eNb Mb/s, More Is Better
Run 3: 229.1 (SE +/- 0.86, N = 3)
Run 2: 228.7 (SE +/- 1.76, N = 3)
Run 1: 227.9 (SE +/- 1.36, N = 3)
1. (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f

srsLTE

Test: PHY_DL_Test

srsLTE 20.10.1, UE Mb/s, More Is Better
Run 3: 86.5 (SE +/- 0.37, N = 3)
Run 2: 86.2 (SE +/- 0.43, N = 3)
Run 1: 86.2 (SE +/- 0.48, N = 3)
1. (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f

Sysbench

Test: RAM / Memory

Sysbench 1.0.20, MiB/sec, More Is Better
Run 2: 6809.31 (SE +/- 32.29, N = 3)
Run 1: 6791.12 (SE +/- 64.27, N = 3)
Run 3: 6733.70 (SE +/- 33.94, N = 3)
1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

Sysbench

Test: CPU

Sysbench 1.0.20, Events Per Second, More Is Better
Run 2: 57054.82 (SE +/- 3.63, N = 3)
Run 3: 57053.00 (SE +/- 1.70, N = 3)
Run 1: 57045.06 (SE +/- 2.08, N = 3)
1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm


Phoronix Test Suite v10.8.4