Intel Xeon Gold 5218 testing with a Supermicro X11SPL-F v1.02 (3.1 BIOS) and llvmpipe 188GB on Ubuntu 19.10 via the Phoronix Test Suite.
Xeon Gold 5218 Processor: Intel Xeon Gold 5218 @ 3.90GHz (16 Cores / 32 Threads), Motherboard: Supermicro X11SPL-F v1.02 (3.1 BIOS), Chipset: Intel Sky Lake-E DMI3 Registers, Memory: 188GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: llvmpipe 188GB, Monitor: VE228, Network: 2 x Intel I210
OS: Ubuntu 19.10, Kernel: 5.5.0-050500-generic (x86_64), Desktop: GNOME Shell 3.34.1, Display Server: X Server 1.20.5, Display Driver: modesetting 1.20.5, OpenGL: 3.3 Mesa 19.2.8 (LLVM 9.0 256 bits), Compiler: GCC 9.2.1 20191008, File-System: ext4, Screen Resolution: 1920x1080
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: intel_pstate powersave - CPU Microcode: 0x500002c
Python Notes: Python 2.7.17 + Python 3.7.5
Security Notes: itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + tsx_async_abort: Mitigation of TSX disabled
MKL-DNN DNNL This is a test of the Intel MKL-DNN (DNNL / Deep Neural Network Library) as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.
MKL-DNN DNNL 1.1, Harness: Deconvolution Batch deconv_all, Data Type: bf16bf16bf16 (ms, fewer is better): Xeon Gold 5218: 6792.39, SE +/- 3.47, N = 3, MIN: 6719.39
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
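Each result in this report carries a figure such as "SE +/- 3.47, N = 3". As a rough illustration of how a value like that can be derived, the sketch below computes the standard error of the mean as the sample standard deviation divided by the square root of N. That formula is an assumption (the report does not state how the Phoronix Test Suite computes SE), and the run times are hypothetical because the individual runs are not included here.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical run times in ms; the per-run values are not part of this report.
runs_ms = [100.0, 102.0, 104.0]

mean_ms = statistics.mean(runs_ms)
se_ms = standard_error(runs_ms)
print(f"{mean_ms:.2f} ms, SE +/- {se_ms:.2f}, N = {len(runs_ms)}")
```

A small SE relative to the mean suggests the runs agree closely. Results with larger N (for example N = 15) are ones where more runs were collected.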
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6, Build: Stock, Size: 1D FFT Size 32 (Mflops, more is better): Xeon Gold 5218: 7700.7, SE +/- 91.10, N = 3
FFTW 3.3.6, Build: Stock, Size: 1D FFT Size 64 (Mflops, more is better): Xeon Gold 5218: 7472.2, SE +/- 11.65, N = 3
(CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
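For readers unfamiliar with what is being measured, the following is a minimal sketch of the discrete Fourier transform computed directly from its definition, together with the scaled "mflops" figure that FFTW's own benchmark methodology uses for complex transforms, 5 N log2(N) divided by the time in microseconds. The function names are hypothetical, the naive O(N^2) loop is not how FFTW computes the transform, and whether this test profile reports exactly that scaled figure is an assumption.

```python
import cmath
import math


def naive_dft(x):
    """Direct O(N^2) evaluation of X[k] = sum_j x[j] * exp(-2*pi*i*j*k/N)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_mflops(n, seconds):
    """Scaled rate for a complex transform of size n:
    5 * n * log2(n) / (time in microseconds)."""
    return 5.0 * n * math.log2(n) / (seconds * 1e6)


# A unit impulse transforms to a flat spectrum of ones.
spectrum = naive_dft([1.0, 0.0, 0.0, 0.0])
print([round(abs(v), 6) for v in spectrum])

# Hypothetical timing: a size-32 transform taking one microsecond.
print(fft_mflops(32, 1e-6))
```

The figure is a normalized rate rather than a count of operations actually executed, which is what makes different transform sizes comparable on one scale.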
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_googlenet_v3, Data Type: bf16bf16bf16 (ms, fewer is better): Xeon Gold 5218: 488.92, SE +/- 0.32, N = 3, MIN: 483.01
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6, Build: Stock, Size: 2D FFT Size 32 (Mflops, more is better): Xeon Gold 5218: 8037.0, SE +/- 390.57, N = 15
FFTW 3.3.6, Build: Stock, Size: 2D FFT Size 64 (Mflops, more is better): Xeon Gold 5218: 5527.5, SE +/- 14.41, N = 3
FFTW 3.3.6, Build: Stock, Size: 1D FFT Size 128 (Mflops, more is better): Xeon Gold 5218: 5355.8, SE +/- 72.39, N = 3
(CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
MKL-DNN DNNL 1.1, Harness: Deconvolution Batch deconv_1d, Data Type: bf16bf16bf16 (ms, fewer is better): Xeon Gold 5218: 16.84, SE +/- 0.03, N = 3, MIN: 16.47
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6, Build: Stock, Size: 1D FFT Size 256 (Mflops, more is better): Xeon Gold 5218: 5241.5, SE +/- 30.00, N = 3
FFTW 3.3.6, Build: Stock, Size: 1D FFT Size 512 (Mflops, more is better): Xeon Gold 5218: 7493.6, SE +/- 7.05, N = 3
FFTW 3.3.6, Build: Stock, Size: 2D FFT Size 128 (Mflops, more is better): Xeon Gold 5218: 7023.2, SE +/- 164.04, N = 15
FFTW 3.3.6, Build: Stock, Size: 2D FFT Size 256 (Mflops, more is better): Xeon Gold 5218: 5960.8, SE +/- 160.35, N = 15
(CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_alexnet, Data Type: bf16bf16bf16 (ms, fewer is better): Xeon Gold 5218: 1888.90, SE +/- 2.46, N = 3, MIN: 1880.02
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6, Build: Stock, Size: 2D FFT Size 512 (Mflops, more is better): Xeon Gold 5218: 6205.5, SE +/- 12.16, N = 3
FFTW 3.3.6, Build: Stock, Size: 1D FFT Size 1024 (Mflops, more is better): Xeon Gold 5218: 7572.9, SE +/- 35.32, N = 3
FFTW 3.3.6, Build: Stock, Size: 1D FFT Size 2048 (Mflops, more is better): Xeon Gold 5218: 7180.1, SE +/- 24.29, N = 3
FFTW 3.3.6, Build: Stock, Size: 1D FFT Size 4096 (Mflops, more is better): Xeon Gold 5218: 7056.4, SE +/- 36.04, N = 3
FFTW 3.3.6, Build: Stock, Size: 2D FFT Size 1024 (Mflops, more is better): Xeon Gold 5218: 6198.8, SE +/- 18.19, N = 3
FFTW 3.3.6, Build: Stock, Size: 2D FFT Size 2048 (Mflops, more is better): Xeon Gold 5218: 4837.3, SE +/- 28.29, N = 3
FFTW 3.3.6, Build: Stock, Size: 2D FFT Size 4096 (Mflops, more is better): Xeon Gold 5218: 4224.6, SE +/- 6.76, N = 3
FFTW 3.3.6, Build: Float + SSE, Size: 1D FFT Size 32 (Mflops, more is better): Xeon Gold 5218: 13727, SE +/- 161.53, N = 3
FFTW 3.3.6, Build: Float + SSE, Size: 1D FFT Size 64 (Mflops, more is better): Xeon Gold 5218: 15952, SE +/- 203.68, N = 3
FFTW 3.3.6, Build: Float + SSE, Size: 2D FFT Size 32 (Mflops, more is better): Xeon Gold 5218: 34928, SE +/- 764.03, N = 15
(CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_googlenet_v3, Data Type: f32 (ms, fewer is better): Xeon Gold 5218: 133.10, SE +/- 0.09, N = 3, MIN: 130.3
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6, Build: Float + SSE, Size: 2D FFT Size 64 (Mflops, more is better): Xeon Gold 5218: 30014, SE +/- 364.33, N = 3
FFTW 3.3.6, Build: Float + SSE, Size: 1D FFT Size 128 (Mflops, more is better): Xeon Gold 5218: 19118, SE +/- 148.93, N = 14
FFTW 3.3.6, Build: Float + SSE, Size: 1D FFT Size 256 (Mflops, more is better): Xeon Gold 5218: 27769, SE +/- 362.39, N = 15
FFTW 3.3.6, Build: Float + SSE, Size: 1D FFT Size 512 (Mflops, more is better): Xeon Gold 5218: 39235, SE +/- 359.64, N = 15
FFTW 3.3.6, Build: Float + SSE, Size: 2D FFT Size 128 (Mflops, more is better): Xeon Gold 5218: 23433, SE +/- 252.28, N = 7
FFTW 3.3.6, Build: Float + SSE, Size: 2D FFT Size 256 (Mflops, more is better): Xeon Gold 5218: 20850, SE +/- 261.39, N = 5
FFTW 3.3.6, Build: Float + SSE, Size: 2D FFT Size 512 (Mflops, more is better): Xeon Gold 5218: 19298, SE +/- 138.66, N = 3
(CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
MKL-DNN DNNL 1.1, Harness: Deconvolution Batch deconv_1d, Data Type: u8s8f32 (ms, fewer is better): Xeon Gold 5218: 0.993071, SE +/- 0.001821, N = 3, MIN: 0.96
MKL-DNN DNNL 1.1, Harness: IP Batch 1D, Data Type: f32 (ms, fewer is better): Xeon Gold 5218: 5.28502, SE +/- 0.04179, N = 3, MIN: 4.82
MKL-DNN DNNL 1.1, Harness: Deconvolution Batch deconv_all, Data Type: f32 (ms, fewer is better): Xeon Gold 5218: 1721.20, SE +/- 0.75, N = 3, MIN: 1693.33
MKL-DNN DNNL 1.1, Harness: Deconvolution Batch deconv_3d, Data Type: f32 (ms, fewer is better): Xeon Gold 5218: 5.85599, SE +/- 0.01377, N = 3, MIN: 5.76
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_googlenet_v3, Data Type: u8s8f32 (ms, fewer is better): Xeon Gold 5218: 35.29, SE +/- 0.03, N = 3, MIN: 34.29
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_3d, Data Type: bf16bf16bf16 (ms, fewer is better): Xeon Gold 5218: 40.58, SE +/- 0.11, N = 3, MIN: 39.81
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_all, Data Type: bf16bf16bf16 (ms, fewer is better): Xeon Gold 5218: 10330.3, SE +/- 2.47, N = 3, MIN: 10267
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_3d, Data Type: f32 (ms, fewer is better): Xeon Gold 5218: 13.02, SE +/- 0.01, N = 3, MIN: 12.61
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_all, Data Type: f32 (ms, fewer is better): Xeon Gold 5218: 2395.11, SE +/- 2.58, N = 3, MIN: 2361.65
MKL-DNN DNNL 1.1, Harness: IP Batch 1D, Data Type: bf16bf16bf16 (ms, fewer is better): Xeon Gold 5218: 8.99765, SE +/- 0.02626, N = 3, MIN: 8.64
MKL-DNN DNNL 1.1, Harness: IP Batch All, Data Type: bf16bf16bf16 (ms, fewer is better): Xeon Gold 5218: 12.94, SE +/- 0.02, N = 3, MIN: 9.64
MKL-DNN DNNL 1.1, Harness: IP Batch All, Data Type: u8s8f32 (ms, fewer is better): Xeon Gold 5218: 4.88367, SE +/- 0.01863, N = 3, MIN: 4.67
MKL-DNN DNNL 1.1, Harness: IP Batch 1D, Data Type: u8s8f32 (ms, fewer is better): Xeon Gold 5218: 0.925650, SE +/- 0.006102, N = 3, MIN: 0.87
MKL-DNN DNNL 1.1, Harness: IP Batch All, Data Type: f32 (ms, fewer is better): Xeon Gold 5218: 9.51973, SE +/- 0.02708, N = 3, MIN: 9.06
MKL-DNN DNNL 1.1, Harness: Deconvolution Batch deconv_3d, Data Type: u8s8f32 (ms, fewer is better): Xeon Gold 5218: 7304.37, SE +/- 5.83, N = 3, MIN: 7288.58
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_3d, Data Type: u8s8f32 (ms, fewer is better): Xeon Gold 5218: 11821.6, SE +/- 6.41, N = 3, MIN: 11788.6
MKL-DNN DNNL 1.1, Harness: Recurrent Neural Network Training, Data Type: f32 (ms, fewer is better): Xeon Gold 5218: 241.11, SE +/- 0.79, N = 3, MIN: 233.62
MKL-DNN DNNL 1.1, Harness: Deconvolution Batch deconv_1d, Data Type: f32 (ms, fewer is better): Xeon Gold 5218: 4.07546, SE +/- 0.01401, N = 3, MIN: 3.94
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_alexnet, Data Type: u8s8f32 (ms, fewer is better): Xeon Gold 5218: 82.26, SE +/- 0.08, N = 3, MIN: 80.66
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_alexnet, Data Type: f32 (ms, fewer is better): Xeon Gold 5218: 301.61, SE +/- 0.52, N = 3, MIN: 297.34
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6, Build: Float + SSE, Size: 1D FFT Size 1024 (Mflops, more is better): Xeon Gold 5218: 49196, SE +/- 332.27, N = 3
(CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_all, Data Type: u8s8f32 (ms, fewer is better): Xeon Gold 5218: 5845.32, SE +/- 4.39, N = 3, MIN: 5816.9
MKL-DNN DNNL 1.1, Harness: Deconvolution Batch deconv_3d, Data Type: bf16bf16bf16 (ms, fewer is better): Xeon Gold 5218: 21.37, SE +/- 0.03, N = 3, MIN: 21.24
(CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
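Because the MKL-DNN results for one harness are scattered across the report, it can help to line them up by data type. The sketch below does that for the conv_googlenet_v3 harness, using the three times reported in this document (133.10 ms for f32, 488.92 ms for bf16bf16bf16, 35.29 ms for u8s8f32). The dictionary layout and the helper name `relative_to` are hypothetical choices for illustration.

```python
# Times in ms for "Convolution Batch conv_googlenet_v3", copied from this report.
conv_googlenet_v3_ms = {
    "f32": 133.10,
    "bf16bf16bf16": 488.92,
    "u8s8f32": 35.29,
}


def relative_to(baseline, results):
    """Return each time divided by the baseline time (values below 1 are faster)."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


ratios = relative_to("f32", conv_googlenet_v3_ms)
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f}x the f32 time")
```

On this system the u8s8f32 run takes roughly a quarter of the f32 time, while the bf16bf16bf16 run takes more than three times as long. The same helper can be pointed at any other harness by swapping in its reported values.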
FFTW 3.3.6, Build: Float + SSE, Size: 1D FFT Size 2048 (Mflops, more is better): Xeon Gold 5218: 45883, SE +/- 604.43, N = 13
FFTW 3.3.6, Build: Float + SSE, Size: 1D FFT Size 4096 (Mflops, more is better): Xeon Gold 5218: 44840, SE +/- 323.08, N = 3
FFTW 3.3.6, Build: Float + SSE, Size: 2D FFT Size 1024 (Mflops, more is better): Xeon Gold 5218: 19316, SE +/- 168.47, N = 3
FFTW 3.3.6, Build: Float + SSE, Size: 2D FFT Size 2048 (Mflops, more is better): Xeon Gold 5218: 15084, SE +/- 124.29, N = 3
FFTW 3.3.6, Build: Float + SSE, Size: 2D FFT Size 4096 (Mflops, more is better): Xeon Gold 5218: 12057, SE +/- 32.95, N = 3
(CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
Tungsten Renderer Tungsten is a C++ physically based renderer that makes use of Intel's Embree ray tracing library. Learn more via the OpenBenchmarking.org test page.
Tungsten Renderer 0.2.2, Scene: Hair (Seconds, fewer is better): Xeon Gold 5218: 22.37, SE +/- 0.02, N = 3
Tungsten Renderer 0.2.2, Scene: Water Caustic (Seconds, fewer is better): Xeon Gold 5218: 26.17, SE +/- 0.14, N = 3
Tungsten Renderer 0.2.2, Scene: Non-Exponential (Seconds, fewer is better): Xeon Gold 5218: 8.27766, SE +/- 0.10378, N = 15
Tungsten Renderer 0.2.2, Scene: Volumetric Caustic (Seconds, fewer is better): Xeon Gold 5218: 9.50049, SE +/- 0.05813, N = 3
(CXX) g++ options: -std=c++0x -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mfma -mbmi2 -mavx512f -mavx512vl -mavx512cd -mavx512dq -mavx512bw -mno-sse4a -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512pf -mno-avx512er -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -ljpeg -lpthread -ldl
C-Ray This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (Seconds, fewer is better): Xeon Gold 5218: 57.51, SE +/- 0.03, N = 3
(CC) gcc options: -lm -lpthread -O3
TTSIOD 3D Renderer A portable GPL 3D software renderer that supports OpenMP and Intel Threading Building Blocks with many different rendering modes. This version does not use OpenGL but is entirely CPU/software based. Learn more via the OpenBenchmarking.org test page.
TTSIOD 3D Renderer 2.3b, Phong Rendering With Soft-Shadow Mapping (FPS, more is better): Xeon Gold 5218: 469.84, SE +/- 0.95, N = 3
(CXX) g++ options: -O3 -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -fopenmp -fwhole-program -lstdc++
Blender 2.82, Blend File: Pabellon Barcelona, Compute: OpenCL (Seconds, fewer is better): Xeon Gold 5218: 1412.20, SE +/- 2.80, N = 3
POV-Ray This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing. Learn more via the OpenBenchmarking.org test page.
POV-Ray 3.7.0.7, Trace Time (Seconds, fewer is better): Xeon Gold 5218: 52.30, SE +/- 1.09, N = 15
(CXX) g++ options: -pipe -O3 -ffast-math -march=native -pthread -lSDL -lXpm -lSM -lICE -lX11 -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
Embree 3.6.1, Binary: Pathtracer ISPC, Model: Crown (Frames Per Second, more is better): Xeon Gold 5218: 14.36, SE +/- 0.01, N = 3, MIN: 14.21 / MAX: 14.55
Embree 3.6.1, Binary: Pathtracer, Model: Asian Dragon (Frames Per Second, more is better): Xeon Gold 5218: 15.19, SE +/- 0.02, N = 3, MIN: 15.11 / MAX: 15.32
Embree 3.6.1, Binary: Pathtracer, Model: Asian Dragon Obj (Frames Per Second, more is better): Xeon Gold 5218: 13.79, SE +/- 0.02, N = 3, MIN: 13.7 / MAX: 13.9
Embree 3.6.1, Binary: Pathtracer ISPC, Model: Asian Dragon (Frames Per Second, more is better): Xeon Gold 5218: 19.26, SE +/- 0.01, N = 3, MIN: 19.15 / MAX: 19.41
Embree 3.6.1, Binary: Pathtracer ISPC, Model: Asian Dragon Obj (Frames Per Second, more is better): Xeon Gold 5218: 16.64, SE +/- 0.01, N = 3, MIN: 16.53 / MAX: 16.8
Smallpt Smallpt is a C++ global illumination renderer written in less than 100 lines of code. Global illumination is done via unbiased Monte Carlo path tracing and there is multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.
Smallpt 1.0, Global Illumination Renderer; 128 Samples (Seconds, fewer is better): Xeon Gold 5218: 8.702, SE +/- 0.018, N = 3
(CXX) g++ options: -fopenmp -O3
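The Smallpt description mentions unbiased Monte Carlo path tracing. As a very small illustration of the underlying idea, and not a reproduction of Smallpt's code, the sketch below estimates the integral of cos(theta) over the hemisphere, whose exact value is pi, by averaging random samples divided by their probability density. The function name, sample count, and seed are hypothetical.

```python
import math
import random


def estimate_cosine_integral(num_samples, seed=0):
    """Monte Carlo estimate of the hemisphere integral of cos(theta), exact value pi.

    Directions are drawn uniformly over the hemisphere (pdf = 1 / (2*pi)),
    in which case cos(theta) is itself uniform on [0, 1).
    """
    rng = random.Random(seed)
    pdf = 1.0 / (2.0 * math.pi)
    total = 0.0
    for _ in range(num_samples):
        cos_theta = rng.random()
        total += cos_theta / pdf  # integrand divided by sampling density
    return total / num_samples


estimate = estimate_cosine_integral(100_000)
print(f"estimate = {estimate:.4f}, exact = {math.pi:.4f}")
```

The estimator is unbiased in the sense that its expected value equals the exact integral. More samples only reduce the noise, which is why a renderer's sample count (128 in the run above) trades render time against image noise.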
OSPray Intel OSPray is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPray builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OSPray 1.8.5, Demo: San Miguel, Renderer: SciVis (FPS, more is better): Xeon Gold 5218: 18.87, SE +/- 0.00, N = 12, MIN: 15.38 / MAX: 19.23
OSPray 1.8.5, Demo: XFrog Forest, Renderer: SciVis (FPS, more is better): Xeon Gold 5218: 3.01, SE +/- 0.00, N = 3, MIN: 2.81 / MAX: 3.02
OSPray 1.8.5, Demo: San Miguel, Renderer: Path Tracer (FPS, more is better): Xeon Gold 5218: 1.75, SE +/- 0.00, N = 3, MIN: 1.7
OSPray 1.8.5, Demo: XFrog Forest, Renderer: Path Tracer (FPS, more is better): Xeon Gold 5218: 1.67, SE +/- 0.00, N = 6, MIN: 1.61 / MAX: 1.68
OSPray 1.8.5, Demo: Magnetic Reconnection, Renderer: SciVis (FPS, more is better): Xeon Gold 5218: 21.74, SE +/- 0.00, N = 12, MIN: 18.52 / MAX: 22.22
OSPray 1.8.5, Demo: NASA Streamlines, Renderer: Path Tracer (FPS, more is better): Xeon Gold 5218: 4.37, SE +/- 0.00, N = 12, MIN: 4.08 / MAX: 4.44
OSPray 1.8.5, Demo: Magnetic Reconnection, Renderer: Path Tracer (FPS, more is better): Xeon Gold 5218: 333.33, SE +/- 0.00, N = 12, MIN: 200
Testing initiated at 19 February 2020 08:51 by user phoronix.