mkl-dnn + fftw Intel Core i5-10600K testing with an ASUS PRIME Z490M-PLUS (0603 BIOS) and Sapphire AMD Radeon RX 470/480/570/570X/580/580X/590 8GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2006163-NE-MKLDNNFFT18&grs&sor .
mkl-dnn + fftw

Systems compared: Intel Core i3-10100, Intel Core i5-10600K

Processor: Intel Core i3-10100 @ 4.30GHz (4 Cores / 8 Threads); Intel Core i5-10600K @ 4.80GHz (6 Cores / 12 Threads)
Motherboard: ASUS PRIME Z490M-PLUS (0603 BIOS)
Chipset: Intel Comet Lake PCH
Memory: 16GB
Disk: 240GB Force MP510 + 2000GB Samsung SSD 860
Graphics: Sapphire AMD Radeon RX 470/480/570/570X/580/580X/590 8GB (1560/2100MHz)
Audio: Realtek ALC887-VD
Monitor: ASUS MG28U
Network: Intel
OS: Ubuntu 20.04
Kernel: 5.7.0-rc6-amd-energy (x86_64) 20200527
Desktop: GNOME Shell 3.36.2
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 4.6 Mesa 20.0.4 (LLVM 9.0.1)
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 3840x2160

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Intel Core i3-10100: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xcc
- Intel Core i5-10600K: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xc8

Security Details
- itlb_multihit: KVM: Vulnerable
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
- tsx_async_abort: Not affected
mkl-dnn + fftw: result overview

Intel Core i3-10100   Intel Core i5-10600K   Test
113.142               52.0325                mkl-dnn: Recurrent Neural Network Inference - f32
7.32233               4.54913                mkl-dnn: IP Batch 1D - f32
3.13627               2.02339                mkl-dnn: IP Batch 1D - u8s8f32
8.42008               5.46402                mkl-dnn: Deconvolution Batch deconv_1d - f32
428.081               285.862                mkl-dnn: Recurrent Neural Network Training - f32
12.5169               8.48212                mkl-dnn: Deconvolution Batch deconv_3d - f32
6.47082               4.41247                mkl-dnn: Deconvolution Batch deconv_3d - u8s8f32
318.859               222.388                mkl-dnn: Deconvolution Batch deconv_1d - u8s8f32
42.9357               30.0636                mkl-dnn: IP Batch All - u8s8f32
101.923               74.6247                mkl-dnn: IP Batch All - f32
29909                 37698                  fftw: Float + SSE - 2D FFT Size 1024
7968.3                9205.0                 fftw: Stock - 1D FFT Size 128
8307.4                9436.7                 fftw: Stock - 1D FFT Size 512
8698.8                9869.1                 fftw: Stock - 1D FFT Size 32
33950                 38420                  fftw: Float + SSE - 2D FFT Size 128
8153.3                9197.2                 fftw: Stock - 1D FFT Size 256
7520.5                8467.0                 fftw: Stock - 2D FFT Size 256
7495.8                8434.5                 fftw: Stock - 2D FFT Size 512
8257.5                9279.0                 fftw: Stock - 2D FFT Size 64
6657.9                7477.3                 fftw: Stock - 2D FFT Size 1024
47673                 53506                  fftw: Float + SSE - 1D FFT Size 2048
47510                 53314                  fftw: Float + SSE - 1D FFT Size 1024
49207                 55199                  fftw: Float + SSE - 2D FFT Size 32
8449.9                9478.6                 fftw: Stock - 1D FFT Size 1024
43885                 49221                  fftw: Float + SSE - 2D FFT Size 64
9711.3                10859                  fftw: Stock - 2D FFT Size 32
34677                 38709                  fftw: Float + SSE - 2D FFT Size 512
8482.1                9457.0                 fftw: Stock - 1D FFT Size 64
44105                 49009                  fftw: Float + SSE - 1D FFT Size 512
8070.1                8965.5                 fftw: Stock - 1D FFT Size 2048
44598                 49493                  fftw: Float + SSE - 1D FFT Size 4096
7895.1                8752.5                 fftw: Stock - 1D FFT Size 4096
25205                 27902                  fftw: Float + SSE - 1D FFT Size 128
32998                 36522                  fftw: Float + SSE - 2D FFT Size 256
35177                 38866                  fftw: Float + SSE - 1D FFT Size 256
21112                 23300                  fftw: Float + SSE - 1D FFT Size 64
7932.2                8708.8                 fftw: Stock - 2D FFT Size 128
16975                 18442                  fftw: Float + SSE - 1D FFT Size 32
24354                 22767                  fftw: Float + SSE - 2D FFT Size 4096
5964.1                6334.2                 fftw: Stock - 2D FFT Size 4096
6258.6                6548.7                 fftw: Stock - 2D FFT Size 2048
26279                 26053                  fftw: Float + SSE - 2D FFT Size 2048
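The overview mixes two metric directions: the mkl-dnn results are in ms (fewer is better) and the fftw results are in Mflops (more is better). The following is a minimal sketch, not part of the Phoronix Test Suite, of how the two columns could be turned into comparable "i5 over i3" ratios. The helper names `speedup` and `geometric_mean` and the choice of four sample rows are assumptions made for illustration; the numbers are copied from the overview.

```python
from math import prod

# (test name, i3-10100 result, i5-10600K result, lower_is_better)
# Values copied from the result overview; only a few rows are shown.
RESULTS = [
    ("mkl-dnn: Recurrent Neural Network Inference - f32", 113.142, 52.0325, True),
    ("mkl-dnn: IP Batch All - f32", 101.923, 74.6247, True),
    ("fftw: Float + SSE - 2D FFT Size 1024", 29909, 37698, False),
    ("fftw: Float + SSE - 2D FFT Size 4096", 24354, 22767, False),
]


def speedup(i3, i5, lower_is_better):
    """Ratio by which the i5 beats the i3; values above 1 favor the i5."""
    # For times (ms) a smaller i5 value is better, so divide i3 by i5.
    # For throughput (Mflops) a larger i5 value is better, so divide i5 by i3.
    return i3 / i5 if lower_is_better else i5 / i3


def geometric_mean(values):
    """Geometric mean, a common way to average ratios."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


ratios = {name: speedup(i3, i5, lib) for name, i3, i5, lib in RESULTS}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
print(f"geometric mean over these rows: {geometric_mean(ratios.values()):.2f}x")
```

A ratio below 1 marks a test where the i3-10100 posted the better number, as in the Float + SSE 2D FFT Size 4096 row.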
oneDNN MKL-DNN 1.3 (OpenBenchmarking.org; ms, Fewer Is Better)

Harness: Recurrent Neural Network Inference - Data Type: f32
  Intel Core i5-10600K: 52.03 (SE +/- 0.22, N = 3, MIN: 50.7)
  Intel Core i3-10100: 113.14 (SE +/- 0.25, N = 3, MIN: 112.24)
Harness: IP Batch 1D - Data Type: f32
  Intel Core i5-10600K: 4.54913 (SE +/- 0.02093, N = 3, MIN: 4.44)
  Intel Core i3-10100: 7.32233 (SE +/- 0.02171, N = 3, MIN: 6.83)
Harness: IP Batch 1D - Data Type: u8s8f32
  Intel Core i5-10600K: 2.02339 (SE +/- 0.00326, N = 3, MIN: 1.97)
  Intel Core i3-10100: 3.13627 (SE +/- 0.00313, N = 3, MIN: 3.11)
Harness: Deconvolution Batch deconv_1d - Data Type: f32
  Intel Core i5-10600K: 5.46402 (SE +/- 0.01618, N = 3, MIN: 5.36)
  Intel Core i3-10100: 8.42008 (SE +/- 0.01195, N = 3, MIN: 8.36)
Harness: Recurrent Neural Network Training - Data Type: f32
  Intel Core i5-10600K: 285.86 (SE +/- 0.49, N = 3, MIN: 283.7)
  Intel Core i3-10100: 428.08 (SE +/- 0.32, N = 3, MIN: 426.8)
Harness: Deconvolution Batch deconv_3d - Data Type: f32
  Intel Core i5-10600K: 8.48212 (SE +/- 0.00750, N = 3, MIN: 8.37)
  Intel Core i3-10100: 12.51690 (SE +/- 0.00097, N = 3, MIN: 12.34)
Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32
  Intel Core i5-10600K: 4.41247 (SE +/- 0.00447, N = 3, MIN: 4.37)
  Intel Core i3-10100: 6.47082 (SE +/- 0.00655, N = 3, MIN: 6.43)
Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32
  Intel Core i5-10600K: 222.39 (SE +/- 3.25, N = 3, MIN: 216.65)
  Intel Core i3-10100: 318.86 (SE +/- 3.88, N = 3, MIN: 313.59)
Harness: IP Batch All - Data Type: u8s8f32
  Intel Core i5-10600K: 30.06 (SE +/- 0.01, N = 3, MIN: 29.71)
  Intel Core i3-10100: 42.94 (SE +/- 0.07, N = 3, MIN: 42.52)
Harness: IP Batch All - Data Type: f32
  Intel Core i5-10600K: 74.62 (SE +/- 0.06, N = 3, MIN: 73.65)
  Intel Core i3-10100: 101.92 (SE +/- 0.11, N = 3, MIN: 100.36)

1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
FFTW 3.3.6 (OpenBenchmarking.org; Mflops, More Is Better)

Build: Float + SSE - Size: 2D FFT Size 1024
  Intel Core i5-10600K: 37698 (SE +/- 111.44, N = 3)
  Intel Core i3-10100: 29909 (SE +/- 132.83, N = 3)
Build: Stock - Size: 1D FFT Size 128
  Intel Core i5-10600K: 9205.0 (SE +/- 20.43, N = 3)
  Intel Core i3-10100: 7968.3 (SE +/- 12.44, N = 3)
Build: Stock - Size: 1D FFT Size 512
  Intel Core i5-10600K: 9436.7 (SE +/- 10.79, N = 3)
  Intel Core i3-10100: 8307.4 (SE +/- 84.60, N = 3)
Build: Stock - Size: 1D FFT Size 32
  Intel Core i5-10600K: 9869.1 (SE +/- 30.95, N = 3)
  Intel Core i3-10100: 8698.8 (SE +/- 125.21, N = 4)
Build: Float + SSE - Size: 2D FFT Size 128
  Intel Core i5-10600K: 38420 (SE +/- 493.36, N = 3)
  Intel Core i3-10100: 33950 (SE +/- 462.07, N = 4)
Build: Stock - Size: 1D FFT Size 256
  Intel Core i5-10600K: 9197.2 (SE +/- 7.65, N = 3)
  Intel Core i3-10100: 8153.3 (SE +/- 34.58, N = 3)
Build: Stock - Size: 2D FFT Size 256
  Intel Core i5-10600K: 8467.0 (SE +/- 15.71, N = 3)
  Intel Core i3-10100: 7520.5 (SE +/- 59.45, N = 3)
Build: Stock - Size: 2D FFT Size 512
  Intel Core i5-10600K: 8434.5 (SE +/- 77.07, N = 3)
  Intel Core i3-10100: 7495.8 (SE +/- 8.90, N = 3)
Build: Stock - Size: 2D FFT Size 64
  Intel Core i5-10600K: 9279.0 (SE +/- 34.34, N = 3)
  Intel Core i3-10100: 8257.5 (SE +/- 23.23, N = 3)
Build: Stock - Size: 2D FFT Size 1024
  Intel Core i5-10600K: 7477.3 (SE +/- 83.43, N = 3)
  Intel Core i3-10100: 6657.9 (SE +/- 6.50, N = 3)
Build: Float + SSE - Size: 1D FFT Size 2048
  Intel Core i5-10600K: 53506 (SE +/- 514.62, N = 3)
  Intel Core i3-10100: 47673 (SE +/- 63.72, N = 3)
Build: Float + SSE - Size: 1D FFT Size 1024
  Intel Core i5-10600K: 53314 (SE +/- 58.54, N = 3)
  Intel Core i3-10100: 47510 (SE +/- 219.08, N = 3)
Build: Float + SSE - Size: 2D FFT Size 32
  Intel Core i5-10600K: 55199 (SE +/- 219.30, N = 3)
  Intel Core i3-10100: 49207 (SE +/- 315.98, N = 3)
Build: Stock - Size: 1D FFT Size 1024
  Intel Core i5-10600K: 9478.6 (SE +/- 13.72, N = 3)
  Intel Core i3-10100: 8449.9 (SE +/- 8.11, N = 3)
Build: Float + SSE - Size: 2D FFT Size 64
  Intel Core i5-10600K: 49221 (SE +/- 661.09, N = 3)
  Intel Core i3-10100: 43885 (SE +/- 600.09, N = 3)
Build: Stock - Size: 2D FFT Size 32
  Intel Core i5-10600K: 10859.0 (SE +/- 35.35, N = 3)
  Intel Core i3-10100: 9711.3 (SE +/- 26.94, N = 3)
Build: Float + SSE - Size: 2D FFT Size 512
  Intel Core i5-10600K: 38709 (SE +/- 165.77, N = 3)
  Intel Core i3-10100: 34677 (SE +/- 33.93, N = 3)
Build: Stock - Size: 1D FFT Size 64
  Intel Core i5-10600K: 9457.0 (SE +/- 73.01, N = 3)
  Intel Core i3-10100: 8482.1 (SE +/- 25.76, N = 3)
Build: Float + SSE - Size: 1D FFT Size 512
  Intel Core i5-10600K: 49009 (SE +/- 212.58, N = 3)
  Intel Core i3-10100: 44105 (SE +/- 88.21, N = 3)
Build: Stock - Size: 1D FFT Size 2048
  Intel Core i5-10600K: 8965.5 (SE +/- 36.66, N = 3)
  Intel Core i3-10100: 8070.1 (SE +/- 26.75, N = 3)
Build: Float + SSE - Size: 1D FFT Size 4096
  Intel Core i5-10600K: 49493 (SE +/- 510.41, N = 3)
  Intel Core i3-10100: 44598 (SE +/- 574.55, N = 5)
Build: Stock - Size: 1D FFT Size 4096
  Intel Core i5-10600K: 8752.5 (SE +/- 38.60, N = 3)
  Intel Core i3-10100: 7895.1 (SE +/- 34.60, N = 3)
Build: Float + SSE - Size: 1D FFT Size 128
  Intel Core i5-10600K: 27902 (SE +/- 349.46, N = 3)
  Intel Core i3-10100: 25205 (SE +/- 248.24, N = 3)
Build: Float + SSE - Size: 2D FFT Size 256
  Intel Core i5-10600K: 36522 (SE +/- 47.79, N = 3)
  Intel Core i3-10100: 32998 (SE +/- 80.84, N = 3)
Build: Float + SSE - Size: 1D FFT Size 256
  Intel Core i5-10600K: 38866 (SE +/- 465.25, N = 3)
  Intel Core i3-10100: 35177 (SE +/- 197.03, N = 3)
Build: Float + SSE - Size: 1D FFT Size 64
  Intel Core i5-10600K: 23300 (SE +/- 243.60, N = 8)
  Intel Core i3-10100: 21112 (SE +/- 204.70, N = 3)
Build: Stock - Size: 2D FFT Size 128
  Intel Core i5-10600K: 8708.8 (SE +/- 61.31, N = 3)
  Intel Core i3-10100: 7932.2 (SE +/- 7.05, N = 3)
Build: Float + SSE - Size: 1D FFT Size 32
  Intel Core i5-10600K: 18442 (SE +/- 278.43, N = 3)
  Intel Core i3-10100: 16975 (SE +/- 30.85, N = 3)
Build: Float + SSE - Size: 2D FFT Size 4096
  Intel Core i3-10100: 24354 (SE +/- 129.05, N = 3)
  Intel Core i5-10600K: 22767 (SE +/- 191.44, N = 3)
Build: Stock - Size: 2D FFT Size 4096
  Intel Core i5-10600K: 6334.2 (SE +/- 28.65, N = 3)
  Intel Core i3-10100: 5964.1 (SE +/- 3.56, N = 3)
Build: Stock - Size: 2D FFT Size 2048
  Intel Core i5-10600K: 6548.7 (SE +/- 11.05, N = 3)
  Intel Core i3-10100: 6258.6 (SE +/- 18.15, N = 3)
Build: Float + SSE - Size: 2D FFT Size 2048
  Intel Core i3-10100: 26279 (SE +/- 191.43, N = 3)
  Intel Core i5-10600K: 26053 (SE +/- 130.95, N = 3)

1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
Phoronix Test Suite v10.8.5