2 x Intel Xeon Platinum 8490H testing with a Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:

    phoronix-test-suite benchmark 2304136-NE-8490HAPRI45
HTML result view exported from: https://openbenchmarking.org/result/2304136-NE-8490HAPRI45&sor&grw
8490h april

Runs a, b, c, d, and e all used the identical configuration:

Processor: 2 x Intel Xeon Platinum 8490H @ 3.50GHz (120 Cores / 240 Threads)
Motherboard: Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS)
Chipset: Intel Device 1bce
Memory: 16 x 64 GB 4800MT/s Samsung M321R8GA0BB0-CQKEG
Disk: 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007 + 960GB INTEL SSDSC2KG96
Graphics: ASPEED
Monitor: VGA HDMI
Network: 4 x Intel E810-C for QSFP + 2 x Intel X710 for 10GBASE-T
OS: Ubuntu 22.04
Kernel: 6.2.0-060200rc7daily20230208-generic (x86_64)
Desktop: GNOME Shell 42.2
Display Server: X Server 1.21.1.3
Vulkan: 1.2.204
Compiler: GCC 11.3.0 + Clang 14.0.0-1ubuntu1
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate performance (EPP: performance); CPU Microcode: 0x2b0000c0
Python Details: Python 3.10.6
Security Details: itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; mmio_stale_data: Not affected; retbleed: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence; srbds: Not affected; tsx_async_abort: Not affected
Result overview: the five runs (a through e) cover TensorFlow 2.12, oneDNN 3.1, Blender 3.5, VVenC 1.8, srsRAN Project 23.3, nginx 1.23.2, and Apache HTTP Server 2.4.56; the per-test figures for each run follow below.
TensorFlow 2.12 (images/sec, more is better; OpenBenchmarking.org) - Device: CPU

Batch Size: 16 - Model: AlexNet (SE +/- 3.12, N = 3): d: 391.88, b: 386.55, e: 386.34, a: 372.88, c: 370.67
Batch Size: 32 - Model: AlexNet (SE +/- 5.22, N = 6): e: 564.79, c: 557.68, b: 556.34, d: 536.63, a: 531.68
Batch Size: 64 - Model: AlexNet (SE +/- 6.00, N = 3): c: 751.67, e: 745.33, a: 743.73, b: 741.87, d: 739.02
Batch Size: 256 - Model: AlexNet (SE +/- 5.13, N = 3): a: 1091.42, b: 1077.57, d: 1071.62, c: 1063.47, e: 1062.06
Batch Size: 512 - Model: AlexNet (SE +/- 2.69, N = 3): b: 1231.85, e: 1230.30, a: 1227.69, d: 1225.54, c: 1214.36
Batch Size: 16 - Model: GoogLeNet (SE +/- 1.60, N = 3): b: 185.78, e: 185.22, d: 184.80, c: 176.84, a: 173.64
Batch Size: 16 - Model: ResNet-50 (SE +/- 0.45, N = 3): e: 64.96, b: 64.31, a: 64.28, d: 63.97, c: 63.78
Batch Size: 32 - Model: GoogLeNet (SE +/- 0.97, N = 3): e: 270.31, b: 267.02, c: 265.08, a: 257.33, d: 249.74
Batch Size: 32 - Model: ResNet-50 (SE +/- 0.33, N = 3): c: 84.98, b: 84.42, d: 84.17, e: 83.45, a: 83.13
Batch Size: 64 - Model: GoogLeNet (SE +/- 2.89, N = 3): a: 348.00, e: 346.20, d: 346.11, b: 342.26, c: 334.11
Batch Size: 64 - Model: ResNet-50 (SE +/- 0.44, N = 3): c: 104.52, a: 103.48, d: 103.14, e: 102.87, b: 102.21
Batch Size: 256 - Model: GoogLeNet (SE +/- 3.52, N = 3): a: 444.17, b: 442.93, e: 441.44, d: 441.29, c: 437.97
Batch Size: 256 - Model: ResNet-50 (SE +/- 0.05, N = 3): a: 130.44, b: 128.89, d: 128.80, e: 128.23, c: 127.52
Batch Size: 512 - Model: GoogLeNet (SE +/- 4.06, N = 3): a: 472.26, e: 469.14, c: 467.33, b: 465.31, d: 462.37
Batch Size: 512 - Model: ResNet-50 (SE +/- 1.30, N = 3): a: 135.88, c: 135.22, d: 134.76, b: 134.34, e: 133.90
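Each result line reports a standard error in the form "SE +/- x, N = y", where N is the number of repeated runs. For readers unfamiliar with the notation, here is a minimal sketch of how such a figure is derived; the three sample values are illustrative placeholders, not raw data from this result file:

```python
import math
import statistics

# Hypothetical raw images/sec samples from N = 3 repeated runs of one test
# (placeholder values, not taken from the result file above).
samples = [391.2, 393.5, 390.9]

mean = statistics.mean(samples)
# Sample standard deviation (n - 1 denominator).
stdev = statistics.stdev(samples)
# Standard error of the mean: stdev / sqrt(N).
se = stdev / math.sqrt(len(samples))

print(f"mean = {mean:.2f}, SE +/- {se:.2f}, N = {len(samples)}")
```

A smaller SE relative to the mean indicates the repeated runs agreed closely, which is why tightly clustered tests above (e.g. SE +/- 0.05 on a ~130 images/sec result) can be read with more confidence than noisier ones.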
oneDNN 3.1 (ms, fewer is better; OpenBenchmarking.org) - Engine: CPU
All oneDNN tests built with: g++ -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

Harness: IP Shapes 1D - Data Type: f32 (SE +/- 0.17695, N = 15): b: 3.05000, e: 3.44677, d: 3.50485, a: 3.56759, c: 3.63585 (MIN, same order: 1.6 / 3.04 / 2.9 / 3.02 / 3.11)
Harness: IP Shapes 3D - Data Type: f32 (SE +/- 0.03552, N = 3): d: 2.37479, a: 2.49757, c: 2.52848, b: 2.67699, e: 2.80869 (MIN: 1.92 / 2.05 / 2.05 / 2.13 / 2.24)
Harness: IP Shapes 1D - Data Type: u8s8f32 (SE +/- 0.18778, N = 12): b: 4.62478, d: 4.87548, a: 5.19379, c: 5.30769, e: 5.34755 (MIN: 2.46 / 3.78 / 3.98 / 3.99 / 4.19)
Harness: IP Shapes 3D - Data Type: u8s8f32 (SE +/- 0.002492, N = 3): a: 0.872476, b: 0.978428, d: 0.981361, e: 0.989308, c: 1.153320 (MIN: 0.67 / 0.77 / 0.78 / 0.78 / 0.92)
Harness: IP Shapes 1D - Data Type: bf16bf16bf16 (SE +/- 0.08402, N = 15): d: 4.97718, b: 5.38734, c: 5.42071, e: 5.55800, a: 5.88472 (MIN: 3.92 / 3.77 / 4.25 / 4.37 / 4.65)
Harness: IP Shapes 3D - Data Type: bf16bf16bf16 (SE +/- 0.03679, N = 15): d: 2.83754, c: 2.91381, e: 3.02188, b: 3.04638, a: 3.16779 (MIN: 2.21 / 2.28 / 2.44 / 2.17 / 2.49)
Harness: Convolution Batch Shapes Auto - Data Type: f32 (SE +/- 0.000551, N = 3): e: 0.400325, b: 0.402983, c: 0.405677, d: 0.408416, a: 0.408711 (MIN: 0.36 for all)
Harness: Deconvolution Batch shapes_1d - Data Type: f32 (SE +/- 0.05, N = 3): e: 14.22, a: 14.27, d: 14.49, c: 14.54, b: 14.63 (MIN: 12.7 / 12.67 / 12.72 / 12.86 / 12.83)
Harness: Deconvolution Batch shapes_3d - Data Type: f32 (SE +/- 0.002808, N = 3): d: 0.711306, e: 0.712248, c: 0.716419, b: 0.718746, a: 0.724413 (MIN: 0.65 / 0.66 / 0.66 / 0.66 / 0.66)
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 (SE +/- 0.019200, N = 15): a: 0.239960, d: 0.296152, e: 0.305503, b: 0.314427, c: 0.433523 (MIN: 0.18 / 0.18 / 0.18 / 0.17 / 0.18)
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 (SE +/- 0.003330, N = 15): d: 0.391957, c: 0.397435, b: 0.410029, e: 0.413735, a: 0.434658 (MIN: 0.32 / 0.32 / 0.31 / 0.33 / 0.33)
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 (SE +/- 0.001233, N = 3): c: 0.219341, e: 0.219348, b: 0.225197, d: 0.225742, a: 0.228971 (MIN: 0.2 / 0.21 / 0.2 / 0.21 / 0.2)
Harness: Recurrent Neural Network Training - Data Type: f32 (SE +/- 38.79, N = 12): e: 1120.64, b: 1155.77, d: 1182.32, c: 1209.39, a: 1216.99 (MIN: 1089.24 / 781.24 / 1123.65 / 1153.33 / 1149.61)
Harness: Recurrent Neural Network Inference - Data Type: f32 (SE +/- 10.27, N = 15): d: 731.10, b: 840.96, e: 848.65, c: 852.58, a: 881.23 (MIN: 715.12 / 756.78 / 823.74 / 818.16 / 840.73)
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 (SE +/- 23.27, N = 14): c: 1081.70, e: 1200.19, d: 1205.38, b: 1232.87, a: 1304.57 (MIN: 1010 / 1170.19 / 1177.67 / 1015.69 / 1219.16)
Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 (SE +/- 0.002565, N = 3): c: 0.217420, d: 0.219490, e: 0.222020, b: 0.223142, a: 0.228741 (MIN: 0.19 / 0.2 / 0.2 / 0.19 / 0.19)
Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 (SE +/- 0.003398, N = 11): d: 0.440410, e: 0.446232, b: 0.451393, c: 0.457893, a: 0.470336 (MIN: 0.35 / 0.35 / 0.34 / 0.35 / 0.36)
Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 (SE +/- 0.002656, N = 3): e: 0.453885, c: 0.457996, d: 0.462589, a: 0.464269, b: 0.466045 (MIN: 0.4 / 0.39 / 0.37 / 0.38 / 0.38)
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 (SE +/- 14.33, N = 15): b: 832.34, d: 832.57, c: 844.36, e: 845.73, a: 873.15 (MIN: 744.45 / 807.52 / 819.84 / 832.31 / 841.29)
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 (SE +/- 17.42, N = 15): e: 1112.04, d: 1184.12, b: 1184.14, a: 1205.29, c: 1228.77 (MIN: 1093.03 / 1154.2 / 1007.85 / 1166.28 / 1195.53)
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 (SE +/- 9.64, N = 5): e: 818.44, a: 861.15, b: 878.49, d: 888.73, c: 904.27 (MIN: 804.26 / 828.35 / 833.55 / 874.25 / 846.06)
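With 21 separate oneDNN timings per run, a single summary number can help when comparing runs; a common choice for lower-is-better latencies spanning very different magnitudes is the geometric mean. A minimal sketch under that assumption (the two short timing lists are illustrative placeholders, not the full oneDNN result set above):

```python
import math

# Illustrative ms timings for two runs over the same sub-tests
# (placeholder values; a real comparison would use all 21 oneDNN results).
run_b = [3.05, 0.978, 840.96]
run_e = [3.447, 0.989, 848.65]

def geomean(values):
    # Geometric mean: exp of the mean of logs; unlike the arithmetic mean,
    # it is not dominated by the largest-magnitude timings.
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Ratio > 1.0 means run_e was slower overall on these sub-tests.
ratio = geomean(run_e) / geomean(run_b)
print(f"run e / run b geomean latency ratio: {ratio:.3f}")
```

The geometric mean is the conventional aggregate for normalized benchmark results precisely because the 800 ms RNN timings would otherwise swamp the sub-millisecond convolution timings.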
Blender 3.5 (Seconds, fewer is better; OpenBenchmarking.org) - Compute: CPU-Only

Blend File: BMW27 (SE +/- 0.15, N = 4): a: 14.03, d: 14.04, b: 14.20, c: 14.21, e: 14.30
Blend File: Classroom (SE +/- 0.30, N = 3): d: 36.31, e: 36.36, a: 36.50, b: 36.66, c: 36.79
Blend File: Fishy Cat (SE +/- 0.09, N = 3): a: 19.36, e: 19.54, b: 19.70, c: 19.94, d: 20.13
Blend File: Barbershop (SE +/- 0.81, N = 3): c: 146.59, d: 147.18, a: 147.25, b: 147.73, e: 148.11
Blend File: Pabellon Barcelona (SE +/- 0.10, N = 3): e: 47.43, c: 47.65, d: 47.73, b: 47.84, a: 48.81
VVenC 1.8 (Frames Per Second, more is better; OpenBenchmarking.org)
All VVenC tests built with: g++ -O3 -flto -fno-fat-lto-objects -flto=auto

Video Input: Bosphorus 4K - Video Preset: Fast (SE +/- 0.037, N = 3): d: 6.443, e: 6.388, b: 6.332, c: 6.314, a: 6.308
Video Input: Bosphorus 4K - Video Preset: Faster (SE +/- 0.068, N = 13): e: 10.067, a: 10.065, b: 10.055, c: 9.967, d: 9.956
Video Input: Bosphorus 1080p - Video Preset: Fast (SE +/- 0.04, N = 3): b: 17.40, c: 17.24, d: 17.21, a: 17.15, e: 16.79
Video Input: Bosphorus 1080p - Video Preset: Faster (SE +/- 0.17, N = 3): b: 30.99, e: 30.37, c: 30.21, a: 28.66, d: 27.62
srsRAN Project 23.3 (Mbps, more is better; OpenBenchmarking.org)
All srsRAN tests built with: g++ -O3 -fno-trapping-math -fno-math-errno -march=native -mfma -lgtest

Test: Downlink Processor Benchmark (SE +/- 2.03, N = 3): c: 326.7, a: 326.5, b: 324.2, e: 324.1, d: 320.8 (MIN/MAX, same order: 72.5/731.1, 71.2/731.7, 68.9/734.8, 69.9/729.7, 71.3/723.1)
Test: PUSCH Processor Benchmark, Throughput Total (SE +/- 87.81, N = 9): a: 7122.4, d: 7079.5, b: 6898.6, e: 6774.5, c: 6547.4 (MIN/MAX: 4599.2/12734.9, 4942.3/12824.3, 2932.3/13017.6, 3650.8/12618.4, 3614.7/12722)
Test: PUSCH Processor Benchmark, Throughput Thread (SE +/- 0.22, N = 3): a: 29.9, b: 29.8, c: 29.7, e: 28.9, d: 28.8 (MIN/MAX: 19.5/52.7, 18.3/53.3, 18.8/52.3, 18.9/52.7, 15.8/52.3)
nginx 1.23.2 (Requests Per Second, more is better; OpenBenchmarking.org)
Built with: gcc -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

Connections: 500 (SE +/- 1323.62, N = 3): a: 250533.37, e: 248416.85, d: 247581.64, c: 246619.54, b: 246156.11

Apache HTTP Server 2.4.56 (Requests Per Second, more is better; OpenBenchmarking.org)
Built with: gcc -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

Concurrent Requests: 500 (SE +/- 98.05, N = 3): e: 85357.84, d: 84694.76, b: 83834.81, a: 80395.59, c: 77777.03
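Since all five runs used identical hardware, the spread between the best and worst run is a quick gauge of run-to-run variation. For the Apache 500-concurrent-requests result above, run e (85357.84 requests/sec) versus run c (77777.03 requests/sec) works out to roughly a 9.7% gap, noticeably wider than most tests in this file. A minimal sketch of that arithmetic:

```python
# Best and worst Apache (500 concurrent requests) results from the runs above.
best = 85357.84   # run e, requests/sec
worst = 77777.03  # run c, requests/sec

# Relative spread of the best run over the worst, in percent.
spread_pct = (best - worst) / worst * 100
print(f"best-vs-worst spread: {spread_pct:.1f}%")
```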
Phoronix Test Suite v10.8.4