AMD Ryzen 5 5600X 6-Core testing with an ASRock X570 Phantom Gaming-ITX/TB3 (P3.00 BIOS) and NVIDIA GeForce RTX 3090 24GB on Ubuntu 18.04 via the Phoronix Test Suite.
NVIDIA GeForce RTX 3090
Processor: AMD Ryzen 5 5600X 6-Core @ 3.70GHz (6 Cores / 12 Threads), Motherboard: ASRock X570 Phantom Gaming-ITX/TB3 (P3.00 BIOS), Chipset: AMD Device 1480, Memory: 64GB, Disk: 2000GB Samsung SSD 970 EVO Plus 2TB + 4001GB Samsung SSD 870 + ProductCode, Graphics: NVIDIA GeForce RTX 3090 24GB, Audio: NVIDIA Device 1aef, Monitor: marantz-AVR, Network: Intel I211 + Intel Device 2723
OS: Ubuntu 18.04, Kernel: 5.4.0-70-generic (x86_64), Desktop: GNOME Shell 3.28.4, Display Server: X Server 1.20.8, Display Driver: NVIDIA 460.32.03, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 11.2.109, Vulkan: 1.2.155, Compiler: GCC 7.5.0 + CUDA 11.2, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa201009
OpenCL Notes: GPU Compute Cores: 10496
Python Notes: Python 3.8.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Numenta Anomaly Benchmark
Numenta Anomaly Benchmark (NAB) is a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It comprises over 50 labeled real-world and artificial time-series data files plus a novel scoring mechanism designed for real-time applications. This test profile currently measures the time to run various detectors. Learn more via the OpenBenchmarking.org test page.
Numenta Anomaly Benchmark 1.1 - Detector: Bayesian Changepoint (Seconds, Fewer Is Better)
NVIDIA GeForce RTX 3090: 43.28 (SE +/- 0.35, N = 3)
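Each result on this page is reported as a mean with "SE +/-" and "N". As a rough reading aid, the sketch below assumes that SE is the standard error of the mean over N runs (SE = s / sqrt(N)); that interpretation is an assumption here, not something the page states. The helper names are hypothetical, and the numbers are the Bayesian Changepoint figures above.

```python
import math


def relative_standard_error(mean: float, standard_error: float) -> float:
    """Return the standard error as a percentage of the mean."""
    return 100.0 * standard_error / mean


def implied_sample_stddev(standard_error: float, n_runs: int) -> float:
    """Back out the sample standard deviation, assuming SE = s / sqrt(N)."""
    return standard_error * math.sqrt(n_runs)


# Figures taken from the Bayesian Changepoint result: 43.28 s, SE +/- 0.35, N = 3.
mean_seconds, se_seconds, n_runs = 43.28, 0.35, 3

print(f"relative SE: {relative_standard_error(mean_seconds, se_seconds):.2f}%")
print(f"implied run-to-run stddev: {implied_sample_stddev(se_seconds, n_runs):.2f} s")
```

A small relative SE suggests the runs were consistent with each other; with only three runs it says little about rare outliers.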
PlaidML - FP16: No - Mode: Inference - Network: VGG16 - Device: CPU (FPS, More Is Better)
NVIDIA GeForce RTX 3090: 10.99 (SE +/- 0.17, N = 3)
TNN 0.2.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
NVIDIA GeForce RTX 3090: 213.48 (SE +/- 0.11, N = 3; MIN: 212.57 / MAX: 222.67)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
NCNN 20201218 - Target: Vulkan GPU (ms, Fewer Is Better) - NVIDIA GeForce RTX 3090
Model: squeezenet_ssd: 16.93 (SE +/- 0.07, N = 3; MIN: 16.52 / MAX: 24.65)
Model: yolov4-tiny: 22.63 (SE +/- 0.02, N = 3; MIN: 21.64 / MAX: 47.8)
Model: resnet50: 27.56 (SE +/- 0.04, N = 3; MIN: 26.94 / MAX: 58.65)
Model: alexnet: 13.42 (SE +/- 0.18, N = 3; MIN: 12.87 / MAX: 23.34)
Model: resnet18: 15.11 (SE +/- 0.02, N = 3; MIN: 14.69 / MAX: 24.68)
Model: vgg16: 55.85 (SE +/- 0.32, N = 3; MIN: 54.73 / MAX: 68)
Model: googlenet: 12.97 (SE +/- 0.06, N = 3; MIN: 12.29 / MAX: 24.79)
Model: efficientnet-b0: 5.09 (SE +/- 0.09, N = 3; MIN: 4.94 / MAX: 9.41)
Model: mnasnet: 3.43 (SE +/- 0.07, N = 3; MIN: 3.3 / MAX: 29.72)
Model: shufflenet-v2: 4.31 (SE +/- 0.07, N = 3; MIN: 4.11 / MAX: 22.87)
Model: mobilenet-v3: 3.18 (SE +/- 0.04, N = 3; MIN: 3.03 / MAX: 20.19)
Model: mobilenet-v2: 3.68 (SE +/- 0.01, N = 3; MIN: 3.6 / MAX: 8.6)
Model: mobilenet: 14.18 (SE +/- 0.08, N = 3; MIN: 13.89 / MAX: 18.64)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU (ms, Fewer Is Better) - NVIDIA GeForce RTX 3090
Model: regnety_400m: 9.73 (SE +/- 0.08, N = 4; MIN: 9.45 / MAX: 34.2)
Model: squeezenet_ssd: 16.70 (SE +/- 0.02, N = 4; MIN: 16.4 / MAX: 35.06)
Model: yolov4-tiny: 22.15 (SE +/- 0.16, N = 4; MIN: 21.24 / MAX: 31.44)
Model: resnet50: 27.47 (SE +/- 0.10, N = 4; MIN: 26.81 / MAX: 52.64)
Model: alexnet: 13.48 (SE +/- 0.11, N = 4; MIN: 12.81 / MAX: 24.99)
Model: resnet18: 15.15 (SE +/- 0.04, N = 4; MIN: 14.72 / MAX: 25.09)
Model: vgg16: 55.98 (SE +/- 0.16, N = 4; MIN: 54.46 / MAX: 67.13)
Model: googlenet: 13.13 (SE +/- 0.01, N = 4; MIN: 12.48 / MAX: 23.04)
Model: blazeface: 1.40 (SE +/- 0.03, N = 4; MIN: 1.33 / MAX: 5.8)
Model: efficientnet-b0: 5.37 (SE +/- 0.11, N = 4; MIN: 5 / MAX: 19.55)
Model: mnasnet: 3.51 (SE +/- 0.04, N = 4; MIN: 3.32 / MAX: 22.87)
Model: shufflenet-v2: 4.40 (SE +/- 0.04, N = 4; MIN: 4.29 / MAX: 29.98)
Model: mobilenet-v3: 3.11 (SE +/- 0.01, N = 4; MIN: 3.06 / MAX: 8.02)
Model: mobilenet-v2: 3.68 (SE +/- 0.01, N = 4; MIN: 3.58 / MAX: 8.31)
Model: mobilenet: 14.22 (SE +/- 0.22, N = 4; MIN: 13.84 / MAX: 23.1)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
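The NCNN means for the Vulkan GPU and CPU targets above look close to each other for every model that appears under both. A minimal sketch for quantifying that gap follows; the dictionary and function names are hypothetical, and the values are copied from the NCNN results above (only a subset of models is shown to keep it short). It only compares the reported means and makes no claim about why the two targets are so similar.

```python
# Mean latency in ms, copied from the NCNN 20201218 results on this page.
vulkan_ms = {
    "squeezenet_ssd": 16.93,
    "yolov4-tiny": 22.63,
    "resnet50": 27.56,
    "vgg16": 55.85,
    "efficientnet-b0": 5.09,
    "mobilenet-v2": 3.68,
}
cpu_ms = {
    "squeezenet_ssd": 16.70,
    "yolov4-tiny": 22.15,
    "resnet50": 27.47,
    "vgg16": 55.98,
    "efficientnet-b0": 5.37,
    "mobilenet-v2": 3.68,
}


def percent_difference(cpu: float, vulkan: float) -> float:
    """Vulkan latency relative to CPU latency, in percent (negative = Vulkan faster)."""
    return 100.0 * (vulkan - cpu) / cpu


for model in sorted(vulkan_ms.keys() & cpu_ms.keys()):
    diff = percent_difference(cpu_ms[model], vulkan_ms[model])
    print(f"{model:16s} CPU {cpu_ms[model]:6.2f} ms  Vulkan {vulkan_ms[model]:6.2f} ms  {diff:+.1f}%")
```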
Mobile Neural Network
MNN is the Mobile Neural Network, a highly efficient, lightweight deep learning framework developed by Alibaba. Learn more via the OpenBenchmarking.org test page.
Mobile Neural Network 1.1.3 (ms, Fewer Is Better) - NVIDIA GeForce RTX 3090
Model: inception-v3: 28.19 (SE +/- 0.05, N = 3; MIN: 27.64 / MAX: 58.71)
Model: mobilenet-v1-1.0: 2.734 (SE +/- 0.007, N = 3; MIN: 2.67 / MAX: 7.63)
Model: MobileNetV2_224: 2.099 (SE +/- 0.002, N = 3; MIN: 2.05 / MAX: 6.64)
Model: resnet-v2-50: 23.68 (SE +/- 0.06, N = 3; MIN: 23.08 / MAX: 54.3)
Model: SqueezeNetV1.0: 4.143 (SE +/- 0.044, N = 3; MIN: 3.94 / MAX: 60.31)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
RNNoise
RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26-minute-long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.
RNNoise 2020-06-28 (Seconds, Fewer Is Better)
NVIDIA GeForce RTX 3090: 15.97 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
DeepSpeech
Mozilla DeepSpeech is a speech-to-text engine powered by TensorFlow for machine learning and derived from Baidu's Deep Speech research paper. This test profile times the speech-to-text process for a roughly three-minute audio recording. Learn more via the OpenBenchmarking.org test page.
DeepSpeech 0.6 - Acceleration: CPU (Seconds, Fewer Is Better)
NVIDIA GeForce RTX 3090: 63.93 (SE +/- 0.22, N = 3)
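Both audio tests process a recording of known length, so their timings can be read as a real-time factor (audio duration divided by processing time). The sketch below is illustrative only: the function name is hypothetical, and the audio durations are approximations taken from the test descriptions (26 minutes for RNNoise, roughly three minutes for DeepSpeech), so the DeepSpeech figure in particular is a ballpark.

```python
def real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are processed per second of wall-clock time."""
    return audio_seconds / processing_seconds


# Audio lengths are approximations from the test descriptions, not measured values.
rnnoise_audio_s = 26 * 60      # "26 minute long" sample
deepspeech_audio_s = 3 * 60    # "roughly three minute" recording

# Processing times are the reported means on this page.
rnnoise_rtf = real_time_factor(rnnoise_audio_s, 15.97)
deepspeech_rtf = real_time_factor(deepspeech_audio_s, 63.93)

print(f"RNNoise:    about {rnnoise_rtf:.1f}x real time")
print(f"DeepSpeech: about {deepspeech_rtf:.1f}x real time")
```

A factor above 1 means the workload keeps up with live audio on this system; the further above 1, the more headroom there is.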
oneDNN
This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
oneDNN 2.1.2 - Engine: CPU (ms, Fewer Is Better) - NVIDIA GeForce RTX 3090
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32: 3.53795 (SE +/- 0.00554, N = 3; MIN: 3.28)
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16: 2109.64 (SE +/- 2.08, N = 3; MIN: 2084.37)
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16: 4001.99 (SE +/- 3.64, N = 3; MIN: 3964.95)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32: 2.25636 (SE +/- 0.00794, N = 3; MIN: 2.17)
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32: 2135.24 (SE +/- 27.02, N = 3; MIN: 2083.3)
Harness: Recurrent Neural Network Training - Data Type: u8s8f32: 3999.33 (SE +/- 5.89, N = 3; MIN: 3962.35)
Harness: Recurrent Neural Network Inference - Data Type: f32: 2107.13 (SE +/- 1.73, N = 3; MIN: 2086.2)
Harness: Recurrent Neural Network Training - Data Type: f32: 4007.49 (SE +/- 3.18, N = 3; MIN: 3971.88)
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32: 4.37194 (SE +/- 0.00941, N = 3; MIN: 4.17)
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32: 2.61408 (SE +/- 0.00540, N = 3; MIN: 2.41)
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32: 12.81 (SE +/- 0.20, N = 4; MIN: 12.25)
Harness: Deconvolution Batch shapes_3d - Data Type: f32: 8.33697 (SE +/- 0.02918, N = 3; MIN: 8.01)
Harness: Deconvolution Batch shapes_1d - Data Type: f32: 10.54 (SE +/- 0.02, N = 3; MIN: 6.44)
Harness: Convolution Batch Shapes Auto - Data Type: f32: 13.54 (SE +/- 0.02, N = 3; MIN: 13.26)
Harness: IP Shapes 3D - Data Type: u8s8f32: 1.47217 (SE +/- 0.00109, N = 3; MIN: 1.41)
Harness: IP Shapes 1D - Data Type: u8s8f32: 1.99499 (SE +/- 0.01040, N = 3; MIN: 1.81)
Harness: IP Shapes 3D - Data Type: f32: 7.09188 (SE +/- 0.02249, N = 3; MIN: 6.94)
Harness: IP Shapes 1D - Data Type: f32: 4.64288 (SE +/- 0.01355, N = 3; MIN: 4.39)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
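Since most oneDNN harnesses above were run with both f32 and u8s8f32 data types, the two timings can be compared directly. The sketch below computes the f32-to-u8s8f32 time ratio per harness from the reported means; the dictionary and function names are hypothetical. A ratio above 1 means the u8s8f32 run was faster on this system, and a ratio below 1 means it was slower.

```python
# Mean time in ms as (f32, u8s8f32), copied from the oneDNN 2.1.2 results on this page.
onednn_ms = {
    "Matrix Multiply Batch Shapes Transformer": (2.25636, 3.53795),
    "Recurrent Neural Network Inference": (2107.13, 2135.24),
    "Recurrent Neural Network Training": (4007.49, 3999.33),
    "Deconvolution Batch shapes_3d": (8.33697, 4.37194),
    "Deconvolution Batch shapes_1d": (10.54, 2.61408),
    "Convolution Batch Shapes Auto": (13.54, 12.81),
    "IP Shapes 3D": (7.09188, 1.47217),
    "IP Shapes 1D": (4.64288, 1.99499),
}


def int8_speedup(f32_ms: float, u8s8f32_ms: float) -> float:
    """Ratio of f32 time to u8s8f32 time (above 1 means u8s8f32 was faster)."""
    return f32_ms / u8s8f32_ms


for harness, (f32_ms, u8_ms) in onednn_ms.items():
    print(f"{harness:42s} f32/u8s8f32 = {int8_speedup(f32_ms, u8_ms):.2f}x")
```

These ratios describe this one run only; they are computed from means with N = 3 or 4 and should not be generalized to other CPUs or oneDNN versions.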
SHOC Scalable HeterOgeneous Computing
This is the CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
NVIDIA GeForce RTX 3090: 2156.05 (SE +/- 0.75, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
Testing initiated at 29 March 2021 23:38 by user brw.