installres1
AMD Ryzen 5 5600X 6-Core testing with an ASRock X570 Phantom Gaming-ITX/TB3 (P3.00 BIOS) and NVIDIA GeForce RTX 3090 24GB on Ubuntu 18.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2103310-HA-INSTALLRE08
NVIDIA GeForce RTX 3090
Processor: AMD Ryzen 5 5600X 6-Core @ 3.70GHz (6 Cores / 12 Threads), Motherboard: ASRock X570 Phantom Gaming-ITX/TB3 (P3.00 BIOS), Chipset: AMD Device 1480, Memory: 64GB, Disk: 2000GB Samsung SSD 970 EVO Plus 2TB + 4001GB Samsung SSD 870 + ProductCode, Graphics: NVIDIA GeForce RTX 3090 24GB, Audio: NVIDIA Device 1aef, Monitor: marantz-AVR, Network: Intel I211 + Intel Device 2723
OS: Ubuntu 18.04, Kernel: 5.4.0-70-generic (x86_64), Desktop: GNOME Shell 3.28.4, Display Server: X Server 1.20.8, Display Driver: NVIDIA 460.32.03, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 11.2.109, Vulkan: 1.2.155, Compiler: GCC 7.5.0 + CUDA 11.2, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa201009
OpenCL Notes: GPU Compute Cores: 10496
Python Notes: Python 3.8.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
installres1 result overview
Test                                                                NVIDIA GeForce RTX 3090
plaidml: No - Inference - VGG16 - CPU                               10.99
plaidml: No - Inference - ResNet 50 - CPU                           9.40
shoc: OpenCL - Triad                                                12.8371
shoc: OpenCL - Reduction                                            383.062
shoc: OpenCL - Bus Speed Download                                   13.0704
shoc: OpenCL - Bus Speed Readback                                   13.1776
shoc: OpenCL - Texture Read Bandwidth                               2156.05
shoc: OpenCL - S3D                                                  428.140
shoc: OpenCL - FFT SP                                               2346.70
shoc: OpenCL - GEMM SGEMM_N                                         8373.79
shoc: OpenCL - Max SP Flops                                         39360.4
shoc: OpenCL - MD5 Hash                                             42.9143
numpy:                                                              466.33
ai-benchmark: Device Inference Score                                1111
ai-benchmark: Device Training Score                                 1163
ai-benchmark: Device AI Score                                       2274
tensorflow-lite: SqueezeNet                                         234227
tensorflow-lite: Inception V4                                       3457463
tensorflow-lite: NASNet Mobile                                      192684
tensorflow-lite: Mobilenet Float                                    159375
tensorflow-lite: Mobilenet Quant                                    174205
tensorflow-lite: Inception ResNet V2                                3121703
onednn: IP Shapes 1D - f32 - CPU                                    4.64288
onednn: IP Shapes 3D - f32 - CPU                                    7.09188
onednn: IP Shapes 1D - u8s8f32 - CPU                                1.99499
onednn: IP Shapes 3D - u8s8f32 - CPU                                1.47217
onednn: Convolution Batch Shapes Auto - f32 - CPU                   13.5351
onednn: Deconvolution Batch shapes_1d - f32 - CPU                   10.5388
onednn: Deconvolution Batch shapes_3d - f32 - CPU                   8.33697
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU               12.8089
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU               2.61408
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU               4.37194
onednn: Recurrent Neural Network Training - f32 - CPU               4007.49
onednn: Recurrent Neural Network Inference - f32 - CPU              2107.13
onednn: Recurrent Neural Network Training - u8s8f32 - CPU           3999.33
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU          2135.24
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU        2.25636
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU      4001.99
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU     2109.64
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU    3.53795
mnn: SqueezeNetV1.0                                                 4.143
mnn: resnet-v2-50                                                   23.676
mnn: MobileNetV2_224                                                2.099
mnn: mobilenet-v1-1.0                                               2.734
mnn: inception-v3                                                   28.194
ncnn: CPU - mobilenet                                               14.22
ncnn: CPU-v2-v2 - mobilenet-v2                                      3.68
ncnn: CPU-v3-v3 - mobilenet-v3                                      3.11
ncnn: CPU - shufflenet-v2                                           4.40
ncnn: CPU - mnasnet                                                 3.51
ncnn: CPU - efficientnet-b0                                         5.37
ncnn: CPU - blazeface                                               1.40
ncnn: CPU - googlenet                                               13.13
ncnn: CPU - vgg16                                                   55.98
ncnn: CPU - resnet18                                                15.15
ncnn: CPU - alexnet                                                 13.48
ncnn: CPU - resnet50                                                27.47
ncnn: CPU - yolov4-tiny                                             22.15
ncnn: CPU - squeezenet_ssd                                          16.70
ncnn: CPU - regnety_400m                                            9.73
ncnn: Vulkan GPU - mobilenet                                        14.18
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2                               3.68
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3                               3.18
ncnn: Vulkan GPU - shufflenet-v2                                    4.31
ncnn: Vulkan GPU - mnasnet                                          3.43
ncnn: Vulkan GPU - efficientnet-b0                                  5.09
ncnn: Vulkan GPU - blazeface                                        1.40
ncnn: Vulkan GPU - googlenet                                        12.97
ncnn: Vulkan GPU - vgg16                                            55.85
ncnn: Vulkan GPU - resnet18                                         15.11
ncnn: Vulkan GPU - alexnet                                          13.42
ncnn: Vulkan GPU - resnet50                                         27.56
ncnn: Vulkan GPU - yolov4-tiny                                      22.63
ncnn: Vulkan GPU - squeezenet_ssd                                   16.93
ncnn: Vulkan GPU - regnety_400m                                     9.62
tnn: CPU - MobileNet v2                                             213.475
tnn: CPU - SqueezeNet v1.1                                          214.405
deepspeech: CPU                                                     63.92980
rnnoise:                                                            15.974
ecp-candle: P1B2                                                    31.596
ecp-candle: P3B1                                                    896.151
ecp-candle: P3B2                                                    453.131
numenta-nab: EXPoSE                                                 319.329
numenta-nab: Relative Entropy                                       19.078
numenta-nab: Windowed Gaussian                                      13.084
numenta-nab: Earthgecko Skyline                                     141.338
numenta-nab: Bayesian Changepoint                                   43.282
mlpack: scikit_ica                                                  33.52
mlpack: scikit_qda                                                  120.45
mlpack: scikit_svm                                                  17.81
mlpack: scikit_linearridgeregression                                2.43
scikit-learn:                                                       8.735
PlaidML (FPS, More Is Better)
FP16: No - Mode: Inference - Network: ResNet 50 - Device: CPU
  NVIDIA GeForce RTX 3090: 9.40 (SE +/- 0.03, N = 3)
oneDNN
This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
oneDNN 2.1.2 (ms, Fewer Is Better)
Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 4.64288 (SE +/- 0.01355, N = 3, MIN: 4.39)
Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 7.09188 (SE +/- 0.02249, N = 3, MIN: 6.94)
Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 1.99499 (SE +/- 0.01040, N = 3, MIN: 1.81)
Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 1.47217 (SE +/- 0.00109, N = 3, MIN: 1.41)
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 13.54 (SE +/- 0.02, N = 3, MIN: 13.26)
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 10.54 (SE +/- 0.02, N = 3, MIN: 6.44)
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 8.33697 (SE +/- 0.02918, N = 3, MIN: 8.01)
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 12.81 (SE +/- 0.20, N = 4, MIN: 12.25)
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 2.61408 (SE +/- 0.00540, N = 3, MIN: 2.41)
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 4.37194 (SE +/- 0.00941, N = 3, MIN: 4.17)
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 4007.49 (SE +/- 3.18, N = 3, MIN: 3971.88)
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 2107.13 (SE +/- 1.73, N = 3, MIN: 2086.2)
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 3999.33 (SE +/- 5.89, N = 3, MIN: 3962.35)
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 2135.24 (SE +/- 27.02, N = 3, MIN: 2083.3)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 2.25636 (SE +/- 0.00794, N = 3, MIN: 2.17)
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
  NVIDIA GeForce RTX 3090: 4001.99 (SE +/- 3.64, N = 3, MIN: 3964.95)
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
  NVIDIA GeForce RTX 3090: 2109.64 (SE +/- 2.08, N = 3, MIN: 2084.37)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
  NVIDIA GeForce RTX 3090: 3.53795 (SE +/- 0.00554, N = 3, MIN: 3.28)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
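Each result in this file is reported with a "SE +/- x, N = n" annotation, where N is the number of runs. As a rough illustration of how such a figure can be derived, the sketch below computes a standard error of the mean from a list of run times. It assumes the conventional definition (sample standard deviation divided by the square root of N); the exact estimator used by the Phoronix Test Suite is not stated in this result file, and the per-run values are hypothetical because the file only records the aggregate.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run times in ms (NOT taken from this result file).
runs_ms = [4.62, 4.64, 4.67]

mean_ms = statistics.mean(runs_ms)
se_ms = standard_error(runs_ms)
print(f"{mean_ms:.5f} ms, SE +/- {se_ms:.5f}, N = {len(runs_ms)}")
```

A small SE relative to the mean suggests the runs were consistent; a larger one (for example the SE +/- 27.02 on the u8s8f32 RNN inference result) suggests more run-to-run variation.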
Mobile Neural Network
MNN is the Mobile Neural Network, a highly efficient, lightweight deep learning framework developed by Alibaba. Learn more via the OpenBenchmarking.org test page.
Mobile Neural Network 1.1.3 (ms, Fewer Is Better)
Model: SqueezeNetV1.0
  NVIDIA GeForce RTX 3090: 4.143 (SE +/- 0.044, N = 3, MIN: 3.94 / MAX: 60.31)
Model: resnet-v2-50
  NVIDIA GeForce RTX 3090: 23.68 (SE +/- 0.06, N = 3, MIN: 23.08 / MAX: 54.3)
Model: MobileNetV2_224
  NVIDIA GeForce RTX 3090: 2.099 (SE +/- 0.002, N = 3, MIN: 2.05 / MAX: 6.64)
Model: mobilenet-v1-1.0
  NVIDIA GeForce RTX 3090: 2.734 (SE +/- 0.007, N = 3, MIN: 2.67 / MAX: 7.63)
Model: inception-v3
  NVIDIA GeForce RTX 3090: 28.19 (SE +/- 0.05, N = 3, MIN: 27.64 / MAX: 58.71)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20201218 (ms, Fewer Is Better)
Target: CPU-v2-v2 - Model: mobilenet-v2
  NVIDIA GeForce RTX 3090: 3.68 (SE +/- 0.01, N = 4, MIN: 3.58 / MAX: 8.31)
Target: CPU-v3-v3 - Model: mobilenet-v3
  NVIDIA GeForce RTX 3090: 3.11 (SE +/- 0.01, N = 4, MIN: 3.06 / MAX: 8.02)
Target: CPU - Model: shufflenet-v2
  NVIDIA GeForce RTX 3090: 4.40 (SE +/- 0.04, N = 4, MIN: 4.29 / MAX: 29.98)
Target: CPU - Model: mnasnet
  NVIDIA GeForce RTX 3090: 3.51 (SE +/- 0.04, N = 4, MIN: 3.32 / MAX: 22.87)
Target: CPU - Model: efficientnet-b0
  NVIDIA GeForce RTX 3090: 5.37 (SE +/- 0.11, N = 4, MIN: 5 / MAX: 19.55)
Target: CPU - Model: blazeface
  NVIDIA GeForce RTX 3090: 1.40 (SE +/- 0.03, N = 4, MIN: 1.33 / MAX: 5.8)
Target: CPU - Model: googlenet
  NVIDIA GeForce RTX 3090: 13.13 (SE +/- 0.01, N = 4, MIN: 12.48 / MAX: 23.04)
Target: CPU - Model: vgg16
  NVIDIA GeForce RTX 3090: 55.98 (SE +/- 0.16, N = 4, MIN: 54.46 / MAX: 67.13)
Target: CPU - Model: resnet18
  NVIDIA GeForce RTX 3090: 15.15 (SE +/- 0.04, N = 4, MIN: 14.72 / MAX: 25.09)
Target: CPU - Model: alexnet
  NVIDIA GeForce RTX 3090: 13.48 (SE +/- 0.11, N = 4, MIN: 12.81 / MAX: 24.99)
Target: CPU - Model: resnet50
  NVIDIA GeForce RTX 3090: 27.47 (SE +/- 0.10, N = 4, MIN: 26.81 / MAX: 52.64)
Target: CPU - Model: yolov4-tiny
  NVIDIA GeForce RTX 3090: 22.15 (SE +/- 0.16, N = 4, MIN: 21.24 / MAX: 31.44)
Target: CPU - Model: squeezenet_ssd
  NVIDIA GeForce RTX 3090: 16.70 (SE +/- 0.02, N = 4, MIN: 16.4 / MAX: 35.06)
Target: CPU - Model: regnety_400m
  NVIDIA GeForce RTX 3090: 9.73 (SE +/- 0.08, N = 4, MIN: 9.45 / MAX: 34.2)
Target: Vulkan GPU - Model: mobilenet
  NVIDIA GeForce RTX 3090: 14.18 (SE +/- 0.08, N = 3, MIN: 13.89 / MAX: 18.64)
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
  NVIDIA GeForce RTX 3090: 3.68 (SE +/- 0.01, N = 3, MIN: 3.6 / MAX: 8.6)
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
  NVIDIA GeForce RTX 3090: 3.18 (SE +/- 0.04, N = 3, MIN: 3.03 / MAX: 20.19)
Target: Vulkan GPU - Model: shufflenet-v2
  NVIDIA GeForce RTX 3090: 4.31 (SE +/- 0.07, N = 3, MIN: 4.11 / MAX: 22.87)
Target: Vulkan GPU - Model: mnasnet
  NVIDIA GeForce RTX 3090: 3.43 (SE +/- 0.07, N = 3, MIN: 3.3 / MAX: 29.72)
Target: Vulkan GPU - Model: efficientnet-b0
  NVIDIA GeForce RTX 3090: 5.09 (SE +/- 0.09, N = 3, MIN: 4.94 / MAX: 9.41)
Target: Vulkan GPU - Model: blazeface
  NVIDIA GeForce RTX 3090: 1.40 (SE +/- 0.05, N = 3, MIN: 1.32 / MAX: 3.76)
Target: Vulkan GPU - Model: googlenet
  NVIDIA GeForce RTX 3090: 12.97 (SE +/- 0.06, N = 3, MIN: 12.29 / MAX: 24.79)
Target: Vulkan GPU - Model: vgg16
  NVIDIA GeForce RTX 3090: 55.85 (SE +/- 0.32, N = 3, MIN: 54.73 / MAX: 68)
Target: Vulkan GPU - Model: resnet18
  NVIDIA GeForce RTX 3090: 15.11 (SE +/- 0.02, N = 3, MIN: 14.69 / MAX: 24.68)
Target: Vulkan GPU - Model: alexnet
  NVIDIA GeForce RTX 3090: 13.42 (SE +/- 0.18, N = 3, MIN: 12.87 / MAX: 23.34)
Target: Vulkan GPU - Model: resnet50
  NVIDIA GeForce RTX 3090: 27.56 (SE +/- 0.04, N = 3, MIN: 26.94 / MAX: 58.65)
Target: Vulkan GPU - Model: yolov4-tiny
  NVIDIA GeForce RTX 3090: 22.63 (SE +/- 0.02, N = 3, MIN: 21.64 / MAX: 47.8)
Target: Vulkan GPU - Model: squeezenet_ssd
  NVIDIA GeForce RTX 3090: 16.93 (SE +/- 0.07, N = 3, MIN: 16.52 / MAX: 24.65)
Target: Vulkan GPU - Model: regnety_400m
  NVIDIA GeForce RTX 3090: 9.62 (SE +/- 0.04, N = 3, MIN: 9.41 / MAX: 14.08)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
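The NCNN results above cover the same models on two targets, CPU and Vulkan GPU. One way to check how closely the two sets track each other is to compute the relative difference per model. The sketch below does that with the mean times copied from the graphs above; the helper name `percent_change` and the dictionary layout are illustrative choices, not part of the Phoronix Test Suite.

```python
# (CPU ms, Vulkan GPU ms) mean times copied from the NCNN 20201218 results above.
ncnn_ms = {
    "mobilenet-v2": (3.68, 3.68),
    "mobilenet-v3": (3.11, 3.18),
    "shufflenet-v2": (4.40, 4.31),
    "mnasnet": (3.51, 3.43),
    "efficientnet-b0": (5.37, 5.09),
    "blazeface": (1.40, 1.40),
    "googlenet": (13.13, 12.97),
    "vgg16": (55.98, 55.85),
    "resnet18": (15.15, 15.11),
    "alexnet": (13.48, 13.42),
    "resnet50": (27.47, 27.56),
    "yolov4-tiny": (22.15, 22.63),
    "squeezenet_ssd": (16.70, 16.93),
    "regnety_400m": (9.73, 9.62),
}


def percent_change(cpu_ms, gpu_ms):
    """Relative change of the Vulkan GPU time versus the CPU time, in percent.

    Negative means the Vulkan GPU run was faster (fewer ms is better).
    """
    return (gpu_ms - cpu_ms) / cpu_ms * 100.0


for model, (cpu_ms, gpu_ms) in ncnn_ms.items():
    print(f"{model:16s} CPU {cpu_ms:6.2f} ms  Vulkan {gpu_ms:6.2f} ms  "
          f"{percent_change(cpu_ms, gpu_ms):+6.2f}%")
```

When reading the output, keep the reported standard errors in mind: a difference smaller than the SE of either run is not meaningful on its own.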
TNN 0.2.3 (ms, Fewer Is Better)
Target: CPU - Model: SqueezeNet v1.1
  NVIDIA GeForce RTX 3090: 214.41 (SE +/- 0.06, N = 3, MIN: 214.21 / MAX: 217.45)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
DeepSpeech
Mozilla DeepSpeech is a speech-to-text engine powered by TensorFlow for machine learning and derived from Baidu's Deep Speech research paper. This test profile times the speech-to-text process for a roughly three-minute audio recording. Learn more via the OpenBenchmarking.org test page.
DeepSpeech 0.6 (Seconds, Fewer Is Better)
Acceleration: CPU
  NVIDIA GeForce RTX 3090: 63.93 (SE +/- 0.22, N = 3)
RNNoise
RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26-minute-long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.
RNNoise 2020-06-28 (Seconds, Fewer Is Better)
  NVIDIA GeForce RTX 3090: 15.97 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
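The DeepSpeech and RNNoise results are wall-clock times for processing audio of a known length, so they can be restated as a real-time factor (seconds of audio handled per second of processing). The sketch below works that arithmetic through. The RNNoise input length (26 minutes) comes from the test description; the DeepSpeech length is only described as "roughly three minute", so 180 seconds is an assumption and the resulting figure is approximate. The function name `realtime_factor` is illustrative.

```python
def realtime_factor(audio_seconds, processing_seconds):
    """Seconds of audio processed per second of wall-clock time."""
    if processing_seconds <= 0:
        raise ValueError("processing time must be positive")
    return audio_seconds / processing_seconds


# RNNoise: 26-minute sample denoised in 15.97 s (values from this result file).
rnnoise_rtf = realtime_factor(26 * 60, 15.97)

# DeepSpeech: "roughly three minute" recording, ASSUMED here to be 180 s,
# transcribed in 63.93 s on the CPU.
deepspeech_rtf = realtime_factor(3 * 60, 63.93)

print(f"RNNoise:    about {rnnoise_rtf:.1f}x real time")
print(f"DeepSpeech: about {deepspeech_rtf:.1f}x real time (approximate input length)")
```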
Numenta Anomaly Benchmark
Numenta Anomaly Benchmark (NAB) is a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It comprises over 50 labeled real-world and artificial time-series data files plus a novel scoring mechanism designed for real-time applications. This test profile currently measures the time to run various detectors. Learn more via the OpenBenchmarking.org test page.
Numenta Anomaly Benchmark 1.1 (Seconds, Fewer Is Better)
Detector: EXPoSE
  NVIDIA GeForce RTX 3090: 319.33 (SE +/- 0.35, N = 3)
Testing initiated at 29 March 2021 23:38 by user brw.