mkl-dnn + fftw
Intel Core i5-10600K testing with an ASUS PRIME Z490M-PLUS (0603 BIOS) and Sapphire AMD Radeon RX 470/480/570/570X/580/580X/590 8GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2006163-NE-MKLDNNFFT18&grr&export=txt&sro
mkl-dnn + fftw

Intel Core i3-10100
  Processor: Intel Core i3-10100 @ 4.30GHz (4 Cores / 8 Threads)
  Motherboard: ASUS PRIME Z490M-PLUS (0603 BIOS)
  Chipset: Intel Comet Lake PCH
  Memory: 16GB
  Disk: 240GB Force MP510 + 2000GB Samsung SSD 860
  Graphics: Sapphire AMD Radeon RX 470/480/570/570X/580/580X/590 8GB (1560/2100MHz)
  Audio: Realtek ALC887-VD
  Monitor: ASUS MG28U
  Network: Intel
  OS: Ubuntu 20.04
  Kernel: 5.7.0-rc6-amd-energy (x86_64) 20200527
  Desktop: GNOME Shell 3.36.2
  Display Server: X Server 1.20.8
  Display Driver: modesetting 1.20.8
  OpenGL: 4.6 Mesa 20.0.4 (LLVM 9.0.1)
  Compiler: GCC 9.3.0
  File-System: ext4
  Screen Resolution: 3840x2160

Intel Core i5-10600K
  Processor: Intel Core i5-10600K @ 4.80GHz (6 Cores / 12 Threads)

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Intel Core i3-10100: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xcc
  - Intel Core i5-10600K: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xc8

Security Details
  - itlb_multihit: KVM: Vulnerable
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  - tsx_async_abort: Not affected
mkl-dnn + fftw

Test                                                Intel Core i3-10100   Intel Core i5-10600K
fftw: Float + SSE - 2D FFT Size 4096                24354                 22767
fftw: Stock - 2D FFT Size 4096                      5964.1                6334.2
fftw: Float + SSE - 2D FFT Size 2048                26279                 26053
mkl-dnn: IP Batch All - f32                         101.923               74.6247
mkl-dnn: IP Batch All - u8s8f32                     42.9357               30.0636
fftw: Stock - 2D FFT Size 2048                      6258.6                6548.7
mkl-dnn: Recurrent Neural Network Training - f32    428.081               285.862
mkl-dnn: Recurrent Neural Network Inference - f32   113.142               52.0325
mkl-dnn: Deconvolution Batch deconv_1d - u8s8f32    318.859               222.388
mkl-dnn: Deconvolution Batch deconv_1d - f32        8.42008               5.46402
fftw: Float + SSE - 2D FFT Size 1024                29909                 37698
mkl-dnn: IP Batch 1D - f32                          7.32233               4.54913
mkl-dnn: IP Batch 1D - u8s8f32                      3.13627               2.02339
fftw: Stock - 2D FFT Size 1024                      6657.9                7477.3
fftw: Float + SSE - 1D FFT Size 4096                44598                 49493
fftw: Float + SSE - 2D FFT Size 512                 34677                 38709
fftw: Float + SSE - 1D FFT Size 2048                47673                 53506
fftw: Stock - 1D FFT Size 4096                      7895.1                8752.5
fftw: Float + SSE - 1D FFT Size 1024                47510                 53314
fftw: Float + SSE - 2D FFT Size 128                 33950                 38420
fftw: Float + SSE - 2D FFT Size 256                 32998                 36522
fftw: Float + SSE - 1D FFT Size 64                  21112                 23300
fftw: Float + SSE - 1D FFT Size 256                 35177                 38866
fftw: Float + SSE - 1D FFT Size 512                 44105                 49009
fftw: Stock - 2D FFT Size 128                       7932.2                8708.8
fftw: Stock - 2D FFT Size 256                       7520.5                8467.0
fftw: Stock - 1D FFT Size 2048                      8070.1                8965.5
fftw: Stock - 2D FFT Size 512                       7495.8                8434.5
fftw: Stock - 1D FFT Size 128                       7968.3                9205.0
fftw: Float + SSE - 2D FFT Size 32                  49207                 55199
fftw: Stock - 1D FFT Size 512                       8307.4                9436.7
mkl-dnn: Deconvolution Batch deconv_3d - u8s8f32    6.47082               4.41247
fftw: Stock - 1D FFT Size 256                       8153.3                9197.2
mkl-dnn: Deconvolution Batch deconv_3d - f32        12.5169               8.48212
fftw: Stock - 2D FFT Size 32                        9711.3                10859
fftw: Stock - 1D FFT Size 1024                      8449.9                9478.6
fftw: Stock - 2D FFT Size 64                        8257.5                9279.0
fftw: Stock - 1D FFT Size 64                        8482.1                9457.0
fftw: Stock - 1D FFT Size 32                        8698.8                9869.1
fftw: Float + SSE - 1D FFT Size 128                 25205                 27902
fftw: Float + SSE - 1D FFT Size 32                  16975                 18442
fftw: Float + SSE - 2D FFT Size 64                  43885                 49221
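The summary mixes two kinds of metric: the fftw results are in Mflops (more is better) and the mkl-dnn results are in ms (fewer is better), so the raw numbers cannot be compared in the same direction. The following is a minimal Python sketch of how a relative figure could be derived for a few rows copied from the summary. The helper name `i5_speedup` and the choice of rows are illustrative assumptions, not part of the Phoronix Test Suite output.

```python
# Rows copied from the summary table above; the selection is illustrative.
# Each entry: (test name, higher_is_better, i3-10100 value, i5-10600K value)
RESULTS = [
    ("fftw: Float + SSE - 2D FFT Size 4096", True, 24354, 22767),
    ("fftw: Stock - 2D FFT Size 4096", True, 5964.1, 6334.2),
    ("mkl-dnn: IP Batch All - f32", False, 101.923, 74.6247),
    ("mkl-dnn: Recurrent Neural Network Inference - f32", False, 113.142, 52.0325),
]


def i5_speedup(higher_is_better, i3_value, i5_value):
    """Return how many times faster the i5 is than the i3 (> 1 means the i5 wins).

    Hypothetical helper: for throughput metrics (Mflops) the ratio is i5 / i3,
    for latency metrics (ms) it is inverted to i3 / i5.
    """
    if higher_is_better:
        return i5_value / i3_value
    return i3_value / i5_value


if __name__ == "__main__":
    for name, higher_is_better, i3_value, i5_value in RESULTS:
        ratio = i5_speedup(higher_is_better, i3_value, i5_value)
        print(f"{name}: i5 is {ratio:.2f}x the i3")
```

A ratio below 1 (as for the Float + SSE 2D FFT Size 4096 row) means the i3-10100 posted the better result in that test.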
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 4096
Mflops, More Is Better
  Intel Core i3-10100: 24354 (SE +/- 129.05, N = 3)
  Intel Core i5-10600K: 22767 (SE +/- 191.44, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 4096
Mflops, More Is Better
  Intel Core i3-10100: 5964.1 (SE +/- 3.56, N = 3)
  Intel Core i5-10600K: 6334.2 (SE +/- 28.65, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 2048
Mflops, More Is Better
  Intel Core i3-10100: 26279 (SE +/- 191.43, N = 3)
  Intel Core i5-10600K: 26053 (SE +/- 130.95, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
oneDNN MKL-DNN 1.3
Harness: IP Batch All - Data Type: f32
ms, Fewer Is Better
  Intel Core i3-10100: 101.92 (SE +/- 0.11, N = 3; MIN: 100.36)
  Intel Core i5-10600K: 74.62 (SE +/- 0.06, N = 3; MIN: 73.65)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
oneDNN MKL-DNN 1.3
Harness: IP Batch All - Data Type: u8s8f32
ms, Fewer Is Better
  Intel Core i3-10100: 42.94 (SE +/- 0.07, N = 3; MIN: 42.52)
  Intel Core i5-10600K: 30.06 (SE +/- 0.01, N = 3; MIN: 29.71)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 2048
Mflops, More Is Better
  Intel Core i3-10100: 6258.6 (SE +/- 18.15, N = 3)
  Intel Core i5-10600K: 6548.7 (SE +/- 11.05, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
oneDNN MKL-DNN 1.3
Harness: Recurrent Neural Network Training - Data Type: f32
ms, Fewer Is Better
  Intel Core i3-10100: 428.08 (SE +/- 0.32, N = 3; MIN: 426.8)
  Intel Core i5-10600K: 285.86 (SE +/- 0.49, N = 3; MIN: 283.7)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
oneDNN MKL-DNN 1.3
Harness: Recurrent Neural Network Inference - Data Type: f32
ms, Fewer Is Better
  Intel Core i3-10100: 113.14 (SE +/- 0.25, N = 3; MIN: 112.24)
  Intel Core i5-10600K: 52.03 (SE +/- 0.22, N = 3; MIN: 50.7)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
oneDNN MKL-DNN 1.3
Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32
ms, Fewer Is Better
  Intel Core i3-10100: 318.86 (SE +/- 3.88, N = 3; MIN: 313.59)
  Intel Core i5-10600K: 222.39 (SE +/- 3.25, N = 3; MIN: 216.65)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
oneDNN MKL-DNN 1.3
Harness: Deconvolution Batch deconv_1d - Data Type: f32
ms, Fewer Is Better
  Intel Core i3-10100: 8.42008 (SE +/- 0.01195, N = 3; MIN: 8.36)
  Intel Core i5-10600K: 5.46402 (SE +/- 0.01618, N = 3; MIN: 5.36)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 1024
Mflops, More Is Better
  Intel Core i3-10100: 29909 (SE +/- 132.83, N = 3)
  Intel Core i5-10600K: 37698 (SE +/- 111.44, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
oneDNN MKL-DNN 1.3
Harness: IP Batch 1D - Data Type: f32
ms, Fewer Is Better
  Intel Core i3-10100: 7.32233 (SE +/- 0.02171, N = 3; MIN: 6.83)
  Intel Core i5-10600K: 4.54913 (SE +/- 0.02093, N = 3; MIN: 4.44)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
oneDNN MKL-DNN 1.3
Harness: IP Batch 1D - Data Type: u8s8f32
ms, Fewer Is Better
  Intel Core i3-10100: 3.13627 (SE +/- 0.00313, N = 3; MIN: 3.11)
  Intel Core i5-10600K: 2.02339 (SE +/- 0.00326, N = 3; MIN: 1.97)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 1024
Mflops, More Is Better
  Intel Core i3-10100: 6657.9 (SE +/- 6.50, N = 3)
  Intel Core i5-10600K: 7477.3 (SE +/- 83.43, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 4096
Mflops, More Is Better
  Intel Core i3-10100: 44598 (SE +/- 574.55, N = 5)
  Intel Core i5-10600K: 49493 (SE +/- 510.41, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 512
Mflops, More Is Better
  Intel Core i3-10100: 34677 (SE +/- 33.93, N = 3)
  Intel Core i5-10600K: 38709 (SE +/- 165.77, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 2048
Mflops, More Is Better
  Intel Core i3-10100: 47673 (SE +/- 63.72, N = 3)
  Intel Core i5-10600K: 53506 (SE +/- 514.62, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 4096
Mflops, More Is Better
  Intel Core i3-10100: 7895.1 (SE +/- 34.60, N = 3)
  Intel Core i5-10600K: 8752.5 (SE +/- 38.60, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 1024
Mflops, More Is Better
  Intel Core i3-10100: 47510 (SE +/- 219.08, N = 3)
  Intel Core i5-10600K: 53314 (SE +/- 58.54, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 128
Mflops, More Is Better
  Intel Core i3-10100: 33950 (SE +/- 462.07, N = 4)
  Intel Core i5-10600K: 38420 (SE +/- 493.36, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 256
Mflops, More Is Better
  Intel Core i3-10100: 32998 (SE +/- 80.84, N = 3)
  Intel Core i5-10600K: 36522 (SE +/- 47.79, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 64
Mflops, More Is Better
  Intel Core i3-10100: 21112 (SE +/- 204.70, N = 3)
  Intel Core i5-10600K: 23300 (SE +/- 243.60, N = 8)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 256
Mflops, More Is Better
  Intel Core i3-10100: 35177 (SE +/- 197.03, N = 3)
  Intel Core i5-10600K: 38866 (SE +/- 465.25, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 512
Mflops, More Is Better
  Intel Core i3-10100: 44105 (SE +/- 88.21, N = 3)
  Intel Core i5-10600K: 49009 (SE +/- 212.58, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 128
Mflops, More Is Better
  Intel Core i3-10100: 7932.2 (SE +/- 7.05, N = 3)
  Intel Core i5-10600K: 8708.8 (SE +/- 61.31, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 256
Mflops, More Is Better
  Intel Core i3-10100: 7520.5 (SE +/- 59.45, N = 3)
  Intel Core i5-10600K: 8467.0 (SE +/- 15.71, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 2048
Mflops, More Is Better
  Intel Core i3-10100: 8070.1 (SE +/- 26.75, N = 3)
  Intel Core i5-10600K: 8965.5 (SE +/- 36.66, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 512
Mflops, More Is Better
  Intel Core i3-10100: 7495.8 (SE +/- 8.90, N = 3)
  Intel Core i5-10600K: 8434.5 (SE +/- 77.07, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 128
Mflops, More Is Better
  Intel Core i3-10100: 7968.3 (SE +/- 12.44, N = 3)
  Intel Core i5-10600K: 9205.0 (SE +/- 20.43, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 32
Mflops, More Is Better
  Intel Core i3-10100: 49207 (SE +/- 315.98, N = 3)
  Intel Core i5-10600K: 55199 (SE +/- 219.30, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 512
Mflops, More Is Better
  Intel Core i3-10100: 8307.4 (SE +/- 84.60, N = 3)
  Intel Core i5-10600K: 9436.7 (SE +/- 10.79, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
oneDNN MKL-DNN 1.3
Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32
ms, Fewer Is Better
  Intel Core i3-10100: 6.47082 (SE +/- 0.00655, N = 3; MIN: 6.43)
  Intel Core i5-10600K: 4.41247 (SE +/- 0.00447, N = 3; MIN: 4.37)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 256
Mflops, More Is Better
  Intel Core i3-10100: 8153.3 (SE +/- 34.58, N = 3)
  Intel Core i5-10600K: 9197.2 (SE +/- 7.65, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
oneDNN MKL-DNN 1.3
Harness: Deconvolution Batch deconv_3d - Data Type: f32
ms, Fewer Is Better
  Intel Core i3-10100: 12.51690 (SE +/- 0.00097, N = 3; MIN: 12.34)
  Intel Core i5-10600K: 8.48212 (SE +/- 0.00750, N = 3; MIN: 8.37)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 32
Mflops, More Is Better
  Intel Core i3-10100: 9711.3 (SE +/- 26.94, N = 3)
  Intel Core i5-10600K: 10859.0 (SE +/- 35.35, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 1024
Mflops, More Is Better
  Intel Core i3-10100: 8449.9 (SE +/- 8.11, N = 3)
  Intel Core i5-10600K: 9478.6 (SE +/- 13.72, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 64
Mflops, More Is Better
  Intel Core i3-10100: 8257.5 (SE +/- 23.23, N = 3)
  Intel Core i5-10600K: 9279.0 (SE +/- 34.34, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 64
Mflops, More Is Better
  Intel Core i3-10100: 8482.1 (SE +/- 25.76, N = 3)
  Intel Core i5-10600K: 9457.0 (SE +/- 73.01, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 32
Mflops, More Is Better
  Intel Core i3-10100: 8698.8 (SE +/- 125.21, N = 4)
  Intel Core i5-10600K: 9869.1 (SE +/- 30.95, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 128
Mflops, More Is Better
  Intel Core i3-10100: 25205 (SE +/- 248.24, N = 3)
  Intel Core i5-10600K: 27902 (SE +/- 349.46, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 32
Mflops, More Is Better
  Intel Core i3-10100: 16975 (SE +/- 30.85, N = 3)
  Intel Core i5-10600K: 18442 (SE +/- 278.43, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 64
Mflops, More Is Better
  Intel Core i3-10100: 43885 (SE +/- 600.09, N = 3)
  Intel Core i5-10600K: 49221 (SE +/- 661.09, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
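The SE figures reported with each result give a rough way to judge whether a gap between the two processors is larger than run-to-run noise. The sketch below, in Python, expresses the difference of two means in units of their combined standard error. It assumes the two SE values in each record are listed in the same order as the two systems and that the runs are independent; `se_gap` is a hypothetical helper name, and with only N = 3 runs per system this is a heuristic, not a formal significance test.

```python
import math


def se_gap(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error.

    Hypothetical helper: the combined SE is sqrt(se_a^2 + se_b^2), which
    assumes the two sets of runs are independent.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


if __name__ == "__main__":
    # FFTW Float + SSE, 2D FFT Size 2048: i3 26279 (SE 191.43) vs i5 26053 (SE 130.95)
    close_call = se_gap(26279, 191.43, 26053, 130.95)
    # FFTW Float + SSE, 2D FFT Size 4096: i3 24354 (SE 129.05) vs i5 22767 (SE 191.44)
    clear_gap = se_gap(24354, 129.05, 22767, 191.44)
    print(f"2D FFT Size 2048: gap of {close_call:.2f} combined SE")
    print(f"2D FFT Size 4096: gap of {clear_gap:.2f} combined SE")
```

By this measure the 2D FFT Size 2048 results sit roughly one combined SE apart, so they are hard to distinguish, while the 2D FFT Size 4096 gap is several times larger than the noise.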
Phoronix Test Suite v10.8.5