sys76-kudu-ml-nvidia
AMD Ryzen 9 5900HX testing with a System76 Kudu (1.07.09RSA1 BIOS) and NVIDIA GeForce RTX 3060 Laptop GPU 6GB on Pop 21.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2202175-NE-SYS76KUDU28 .
sys76-kudu-ml-nvidia: system details [NVIDIA GeForce RTX 3060 Laptop GPU]
Processor: AMD Ryzen 9 5900HX @ 3.30GHz (8 Cores / 16 Threads)
Motherboard: System76 Kudu (1.07.09RSA1 BIOS)
Chipset: AMD Renoir/Cezanne
Memory: 16GB
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: NVIDIA GeForce RTX 3060 Laptop GPU 6GB
Audio: NVIDIA Device 228e
Network: Realtek RTL8125 2.5GbE + Intel Wi-Fi 6 AX200
OS: Pop 21.10
Kernel: 5.15.15-76051515-generic (x86_64)
Desktop: GNOME Shell 40.5
Display Server: X Server 1.20.13
Display Driver: NVIDIA 470.86
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 11.4.158
Vulkan: 1.2.182
Compiler: GCC 11.2.0
File-System: ext4
Screen Resolution: 1920x1080

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa50000c
- GLAMOR
- BAR1 / Visible vRAM Size: 8192 MiB
- GPU Compute Cores: 3840
- Python 3.9.7
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
sys76-kudu-ml-nvidia: result overview
Test | NVIDIA GeForce RTX 3060 Laptop GPU
shoc: OpenCL - S3D | 166.282
shoc: OpenCL - Triad | 6.5530
shoc: OpenCL - FFT SP | 877.307
shoc: OpenCL - MD5 Hash | 16.2706
shoc: OpenCL - Reduction | 295.567
shoc: OpenCL - GEMM SGEMM_N | 2765.33
shoc: OpenCL - Max SP Flops | 15066.7
shoc: OpenCL - Bus Speed Download | 6.6926
shoc: OpenCL - Bus Speed Readback | 6.7648
shoc: OpenCL - Texture Read Bandwidth | 1309.85
lczero: BLAS | 565
onednn: IP Shapes 1D - f32 - CPU | 4.30842
onednn: IP Shapes 3D - f32 - CPU | 11.2735
onednn: IP Shapes 1D - u8s8f32 - CPU | 1.63127
onednn: IP Shapes 3D - u8s8f32 - CPU | 2.47936
onednn: Convolution Batch Shapes Auto - f32 - CPU | 22.5331
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 8.34154
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 6.71586
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 23.3045
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 2.12066
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 3.18077
onednn: Recurrent Neural Network Training - f32 - CPU | 3570.14
onednn: Recurrent Neural Network Inference - f32 - CPU | 2135.56
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 3566.36
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 2164.93
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 4.60966
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 3558.73
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 2164.42
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 3.00110
numpy: | 432.86
deepspeech: CPU | 68.95180
rbenchmark: | 0.1261
rnnoise: | 16.110
tensorflow-lite: SqueezeNet | 190603
tensorflow-lite: Inception V4 | 2758593
tensorflow-lite: NASNet Mobile | 151814
tensorflow-lite: Mobilenet Float | 127871
tensorflow-lite: Mobilenet Quant | 141345
tensorflow-lite: Inception ResNet V2 | 2486377
caffe: AlexNet - CPU - 100 | 33278
caffe: AlexNet - CPU - 200 | 65872
caffe: AlexNet - CPU - 1000 | 325342
caffe: GoogleNet - CPU - 100 | 86800
caffe: GoogleNet - CPU - 200 | 174373
caffe: GoogleNet - CPU - 1000 | 866136
mnn: mobilenetV3 | 1.199
mnn: squeezenetv1.1 | 2.773
mnn: resnet-v2-50 | 22.606
mnn: SqueezeNetV1.0 | 4.490
mnn: MobileNetV2_224 | 2.385
mnn: mobilenet-v1-1.0 | 2.411
mnn: inception-v3 | 31.923
ncnn: CPU - mobilenet | 15.69
ncnn: CPU-v2-v2 - mobilenet-v2 | 4.00
ncnn: CPU-v3-v3 - mobilenet-v3 | 3.48
ncnn: CPU - shufflenet-v2 | 2.74
ncnn: CPU - mnasnet | 3.24
ncnn: CPU - efficientnet-b0 | 5.26
ncnn: CPU - blazeface | 1.19
ncnn: CPU - googlenet | 13.44
ncnn: CPU - vgg16 | 70.90
ncnn: CPU - resnet18 | 15.92
ncnn: CPU - alexnet | 14.63
ncnn: CPU - resnet50 | 25.27
ncnn: CPU - yolov4-tiny | 25.11
ncnn: CPU - squeezenet_ssd | 18.56
ncnn: CPU - regnety_400m | 6.85
ncnn: Vulkan GPU - mobilenet | 10.6
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 4.02
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 4.66
ncnn: Vulkan GPU - shufflenet-v2 | 2.88
ncnn: Vulkan GPU - mnasnet | 3.96
ncnn: Vulkan GPU - efficientnet-b0 | 10.03
ncnn: Vulkan GPU - blazeface | 1.34
ncnn: Vulkan GPU - googlenet | 9.01
ncnn: Vulkan GPU - vgg16 | 44.99
ncnn: Vulkan GPU - resnet18 | 6.51
ncnn: Vulkan GPU - alexnet | 6.73
ncnn: Vulkan GPU - resnet50 | 13.36
ncnn: Vulkan GPU - yolov4-tiny | 19.01
ncnn: Vulkan GPU - squeezenet_ssd | 19.63
ncnn: Vulkan GPU - regnety_400m | 5.79
tnn: CPU - DenseNet | 2721.407
tnn: CPU - MobileNet v2 | 250.129
tnn: CPU - SqueezeNet v2 | 54.863
tnn: CPU - SqueezeNet v1.1 | 222.651
plaidml: No - Inference - VGG16 - CPU | 13.26
plaidml: No - Inference - ResNet 50 - CPU | 7.00
ecp-candle: P1B2 | 35.184
ecp-candle: P3B1 | 1473.428
ecp-candle: P3B2 | 716.514
mlpack: scikit_ica | 48.31
mlpack: scikit_qda | 66.04
mlpack: scikit_svm | 17.59
mlpack: scikit_linearridgeregression | 2.09
opencv: DNN - Deep Neural Network | 28019
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - More Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]
Benchmark: S3D - 166.28 GFLOPS (SE +/- 0.03, N = 3)
Benchmark: Triad - 6.5530 GB/s (SE +/- 0.0003, N = 3)
Benchmark: FFT SP - 877.31 GFLOPS (SE +/- 0.02, N = 3)
Benchmark: MD5 Hash - 16.27 GHash/s (SE +/- 0.02, N = 3)
Benchmark: Reduction - 295.57 GB/s (SE +/- 0.07, N = 3)
Benchmark: GEMM SGEMM_N - 2765.33 GFLOPS (SE +/- 14.73, N = 3)
Benchmark: Max SP Flops - 15066.7 GFLOPS (SE +/- 19.26, N = 3)
Benchmark: Bus Speed Download - 6.6926 GB/s (SE +/- 0.0004, N = 3)
Benchmark: Bus Speed Readback - 6.7648 GB/s (SE +/- 0.0000, N = 3)
Benchmark: Texture Read Bandwidth - 1309.85 GB/s (SE +/- 2.56, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
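The SE figures are easier to compare across benchmarks with different units when expressed relative to the mean. The sketch below does only that arithmetic on the SHOC numbers reported above; the helper name `relative_se_percent` is hypothetical and not part of the Phoronix Test Suite.

```python
# (benchmark, mean, standard error) copied from the SHOC OpenCL results above.
shoc_results = [
    ("S3D", 166.28, 0.03),
    ("Triad", 6.5530, 0.0003),
    ("FFT SP", 877.31, 0.02),
    ("MD5 Hash", 16.27, 0.02),
    ("Reduction", 295.57, 0.07),
    ("GEMM SGEMM_N", 2765.33, 14.73),
    ("Max SP Flops", 15066.7, 19.26),
    ("Bus Speed Download", 6.6926, 0.0004),
    ("Bus Speed Readback", 6.7648, 0.0000),
    ("Texture Read Bandwidth", 1309.85, 2.56),
]


def relative_se_percent(mean, se):
    """Standard error as a percentage of the mean (hypothetical helper)."""
    return 100.0 * se / mean


# Benchmark whose reported SE is largest relative to its mean.
noisiest = max(shoc_results, key=lambda r: relative_se_percent(r[1], r[2]))

for name, mean, se in shoc_results:
    print(f"{name:24s} {relative_se_percent(mean, se):7.3f} %")
print("largest relative SE:", noisiest[0])
```

On these numbers every SHOC result has a relative SE below 1%, with GEMM SGEMM_N the largest at roughly 0.5%.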
LeelaChessZero 0.28 - Backend: BLAS - Nodes Per Second, More Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]: 565 (SE +/- 6.66, N = 3)
1. (CXX) g++ options: -flto -pthread
oneDNN 2.1.2 - Engine: CPU - ms, Fewer Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]
Harness: IP Shapes 1D - Data Type: f32 - 4.30842 (SE +/- 0.05326, N = 4, MIN: 3.94)
Harness: IP Shapes 3D - Data Type: f32 - 11.27 (SE +/- 0.01, N = 3, MIN: 10.74)
Harness: IP Shapes 1D - Data Type: u8s8f32 - 1.63127 (SE +/- 0.00737, N = 3, MIN: 1.49)
Harness: IP Shapes 3D - Data Type: u8s8f32 - 2.47936 (SE +/- 0.03298, N = 15, MIN: 2.32)
Harness: Convolution Batch Shapes Auto - Data Type: f32 - 22.53 (SE +/- 0.04, N = 3, MIN: 21.51)
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - 8.34154 (SE +/- 0.01115, N = 3, MIN: 4.81)
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - 6.71586 (SE +/- 0.01705, N = 3, MIN: 6.54)
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - 23.30 (SE +/- 0.05, N = 3, MIN: 22.33)
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - 2.12066 (SE +/- 0.00175, N = 3, MIN: 1.95)
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - 3.18077 (SE +/- 0.03473, N = 3, MIN: 2.72)
Harness: Recurrent Neural Network Training - Data Type: f32 - 3570.14 (SE +/- 11.16, N = 3, MIN: 3520.53)
Harness: Recurrent Neural Network Inference - Data Type: f32 - 2135.56 (SE +/- 5.46, N = 3, MIN: 2104.01)
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - 3566.36 (SE +/- 5.55, N = 3, MIN: 3515.62)
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - 2164.93 (SE +/- 1.37, N = 3, MIN: 2138.25)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - 4.60966 (SE +/- 0.00366, N = 3, MIN: 4.42)
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - 3558.73 (SE +/- 6.65, N = 3, MIN: 3513.19)
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - 2164.42 (SE +/- 15.52, N = 3, MIN: 2113.31)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - 3.00110 (SE +/- 0.00612, N = 3, MIN: 2.77)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
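Each oneDNN harness above was run with both f32 and u8s8f32 data types, so the two means can be put side by side as a ratio. The following is a minimal sketch of that arithmetic only; the dictionary layout and the function name `f32_over_u8s8f32` are assumptions made for illustration, and the ratio says nothing about why a harness does or does not benefit from the quantized type.

```python
# Mean times in ms (fewer is better), copied from the oneDNN 2.1.2 CPU results above.
onednn_ms = {
    "IP Shapes 1D": {"f32": 4.30842, "u8s8f32": 1.63127},
    "IP Shapes 3D": {"f32": 11.27, "u8s8f32": 2.47936},
    "Convolution Batch Shapes Auto": {"f32": 22.53, "u8s8f32": 23.30},
    "Deconvolution Batch shapes_1d": {"f32": 8.34154, "u8s8f32": 2.12066},
    "Deconvolution Batch shapes_3d": {"f32": 6.71586, "u8s8f32": 3.18077},
    "Recurrent Neural Network Training": {"f32": 3570.14, "u8s8f32": 3566.36},
    "Recurrent Neural Network Inference": {"f32": 2135.56, "u8s8f32": 2164.93},
    "Matrix Multiply Batch Shapes Transformer": {"f32": 4.60966, "u8s8f32": 3.00110},
}


def f32_over_u8s8f32(times):
    """Ratio > 1 means the u8s8f32 run finished faster than the f32 run."""
    return times["f32"] / times["u8s8f32"]


ratios = {harness: f32_over_u8s8f32(t) for harness, t in onednn_ms.items()}

for harness, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{harness:42s} {ratio:5.2f}x")
```

A ratio near 1.0 (as for the recurrent neural network harnesses here) means the two data types took about the same time on this system.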
Numpy Benchmark - Score, More Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]: 432.86 (SE +/- 0.83, N = 3)
DeepSpeech 0.6 - Acceleration: CPU - Seconds, Fewer Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]: 68.95 (SE +/- 0.10, N = 3)
R Benchmark - Seconds, Fewer Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]: 0.1261 (SE +/- 0.0004, N = 3). 1. R scripting front-end version 4.0.4 (2021-02-15)
RNNoise 2020-06-28 - Seconds, Fewer Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]: 16.11 (SE +/- 0.02, N = 3). 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
TensorFlow Lite 2020-08-23 - Microseconds, Fewer Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]
Model: SqueezeNet - 190603 (SE +/- 32.83, N = 3)
Model: Inception V4 - 2758593 (SE +/- 636.98, N = 3)
Model: NASNet Mobile - 151814 (SE +/- 523.89, N = 3)
Model: Mobilenet Float - 127871 (SE +/- 44.24, N = 3)
Model: Mobilenet Quant - 141345 (SE +/- 185.90, N = 3)
Model: Inception ResNet V2 - 2486377 (SE +/- 1929.87, N = 3)
Caffe 2020-02-13 - Acceleration: CPU - Milli-Seconds, Fewer Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]
Model: AlexNet - Iterations: 100 - 33278 (SE +/- 36.37, N = 3)
Model: AlexNet - Iterations: 200 - 65872 (SE +/- 77.42, N = 3)
Model: AlexNet - Iterations: 1000 - 325342 (SE +/- 901.36, N = 3)
Model: GoogleNet - Iterations: 100 - 86800 (SE +/- 83.58, N = 3)
Model: GoogleNet - Iterations: 200 - 174373 (SE +/- 519.84, N = 3)
Model: GoogleNet - Iterations: 1000 - 866136 (SE +/- 360.03, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
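Because the Caffe results above are total times for 100, 200 and 1000 iterations, dividing by the iteration count gives a per-iteration figure that can be compared across the three runs. This is a small sketch of that division on the reported means; the variable names are hypothetical.

```python
# Total time in ms per (model, iterations), copied from the Caffe CPU results above.
caffe_total_ms = {
    ("AlexNet", 100): 33278,
    ("AlexNet", 200): 65872,
    ("AlexNet", 1000): 325342,
    ("GoogleNet", 100): 86800,
    ("GoogleNet", 200): 174373,
    ("GoogleNet", 1000): 866136,
}

# Per-iteration time in ms: total time divided by the iteration count.
per_iteration_ms = {
    key: total / key[1] for key, total in caffe_total_ms.items()
}

for (model, iterations), ms in per_iteration_ms.items():
    print(f"{model:10s} {iterations:5d} iterations: {ms:8.2f} ms/iteration")
```

On these numbers the per-iteration time stays within a few percent across iteration counts for each model (about 325 to 333 ms for AlexNet, about 866 to 872 ms for GoogleNet).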
Mobile Neural Network 1.2 - ms, Fewer Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]
Model: mobilenetV3 - 1.199 (SE +/- 0.004, N = 3, MIN: 1.14 / MAX: 9.8)
Model: squeezenetv1.1 - 2.773 (SE +/- 0.014, N = 3, MIN: 2.57 / MAX: 11.72)
Model: resnet-v2-50 - 22.61 (SE +/- 0.23, N = 3, MIN: 21.44 / MAX: 48.44)
Model: SqueezeNetV1.0 - 4.490 (SE +/- 0.077, N = 3, MIN: 4.31 / MAX: 10.28)
Model: MobileNetV2_224 - 2.385 (SE +/- 0.012, N = 3, MIN: 2.24 / MAX: 20.06)
Model: mobilenet-v1-1.0 - 2.411 (SE +/- 0.039, N = 3, MIN: 2.17 / MAX: 19.27)
Model: inception-v3 - 31.92 (SE +/- 0.30, N = 3, MIN: 29.4 / MAX: 49.5)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20210720 - Target: CPU - ms, Fewer Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]
Target: CPU - Model: mobilenet - 15.69 (SE +/- 0.22, N = 3, MIN: 14.93 / MAX: 22)
Target: CPU-v2-v2 - Model: mobilenet-v2 - 4.00 (SE +/- 0.01, N = 3, MIN: 3.73 / MAX: 9.6)
Target: CPU-v3-v3 - Model: mobilenet-v3 - 3.48 (SE +/- 0.01, N = 3, MIN: 3.18 / MAX: 9.98)
Target: CPU - Model: shufflenet-v2 - 2.74 (SE +/- 0.02, N = 3, MIN: 2.46 / MAX: 14.43)
Target: CPU - Model: mnasnet - 3.24 (SE +/- 0.01, N = 3, MIN: 2.91 / MAX: 8.64)
Target: CPU - Model: efficientnet-b0 - 5.26 (SE +/- 0.01, N = 3, MIN: 4.89 / MAX: 15.18)
Target: CPU - Model: blazeface - 1.19 (SE +/- 0.01, N = 3, MIN: 1.15 / MAX: 6.67)
Target: CPU - Model: googlenet - 13.44 (SE +/- 0.37, N = 3, MIN: 12.49 / MAX: 28.69)
Target: CPU - Model: vgg16 - 70.90 (SE +/- 0.04, N = 3, MIN: 69.81 / MAX: 84.03)
Target: CPU - Model: resnet18 - 15.92 (SE +/- 0.44, N = 3, MIN: 14.75 / MAX: 26.77)
Target: CPU - Model: alexnet - 14.63 (SE +/- 0.02, N = 3, MIN: 14.11 / MAX: 22.9)
Target: CPU - Model: resnet50 - 25.27 (SE +/- 0.08, N = 3, MIN: 24.15 / MAX: 39.62)
Target: CPU - Model: yolov4-tiny - 25.11 (SE +/- 0.08, N = 3, MIN: 24.08 / MAX: 47.42)
Target: CPU - Model: squeezenet_ssd - 18.56 (SE +/- 0.06, N = 3, MIN: 17.91 / MAX: 26.49)
Target: CPU - Model: regnety_400m - 6.85 (SE +/- 0.02, N = 3, MIN: 6.35 / MAX: 16.05)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720 - Target: Vulkan GPU - ms, Fewer Is Better [NVIDIA GeForce RTX 3060 Laptop GPU]
Target: Vulkan GPU - Model: mobilenet - 10.6 (SE +/- 0.09, N = 3, MIN: 9.57 / MAX: 13.42)
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 - 4.02 (SE +/- 0.09, N = 3, MIN: 3.54 / MAX: 6.66)
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 - 4.66 (SE +/- 0.07, N = 3, MIN: 4.22 / MAX: 8.42)
Target: Vulkan GPU - Model: shufflenet-v2 - 2.88 (SE +/- 0.01, N = 3, MIN: 2.33 / MAX: 4.23)
Target: Vulkan GPU - Model: mnasnet - 3.96 (SE +/- 0.04, N = 3, MIN: 3.55 / MAX: 7.81)
Target: Vulkan GPU - Model: efficientnet-b0 - 10.03 (SE +/- 0.10, N = 3, MIN: 9.08 / MAX: 16.52)
Target: Vulkan GPU - Model: blazeface - 1.34 (SE +/- 0.03, N = 3, MIN: 1.18 / MAX: 2.63)
Target: Vulkan GPU - Model: googlenet - 9.01 (SE +/- 0.01, N = 3, MIN: 8.1 / MAX: 13.74)
Target: Vulkan GPU - Model: vgg16 - 44.99 (SE +/- 0.01, N = 3, MIN: 44.03 / MAX: 51.73)
Target: Vulkan GPU - Model: resnet18 - 6.51 (SE +/- 0.21, N = 3, MIN: 5.75 / MAX: 13.73)
Target: Vulkan GPU - Model: alexnet - 6.73 (SE +/- 0.20, N = 3, MIN: 6.02 / MAX: 10.21)
Target: Vulkan GPU - Model: resnet50 - 13.36 (SE +/- 0.05, N = 3, MIN: 12.46 / MAX: 19.03)
Target: Vulkan GPU - Model: yolov4-tiny - 19.01 (SE +/- 0.43, N = 3, MIN: 17.19 / MAX: 30.25)
Target: Vulkan GPU - Model: squeezenet_ssd - 19.63 (SE +/- 4.25, N = 3, MIN: 13.92 / MAX: 38.56)
Target: Vulkan GPU - Model: regnety_400m - 5.79 (SE +/- 0.14, N = 3, MIN: 4.79 / MAX: 9.38)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
TNN 0.3 - Target: CPU - Model: DenseNet (ms, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 2721.41 (SE +/- 2.86, N = 3; MIN: 2675.12 / MAX: 2805.37)
TNN 0.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 250.13 (SE +/- 0.45, N = 3; MIN: 247.81 / MAX: 257.68)
TNN 0.3 - Target: CPU - Model: SqueezeNet v2 (ms, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 54.86 (SE +/- 0.15, N = 3; MIN: 54.44 / MAX: 55.55)
TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 222.65 (SE +/- 0.12, N = 3; MIN: 221.97 / MAX: 223.85)
TNN 0.3 (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
PlaidML - FP16: No - Mode: Inference - Network: VGG16 - Device: CPU (FPS, More Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 13.26 (SE +/- 0.11, N = 12)
PlaidML - FP16: No - Mode: Inference - Network: ResNet 50 - Device: CPU (FPS, More Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 7.00 (SE +/- 0.02, N = 3)
ECP-CANDLE 0.4 - Benchmark: P1B2 (Seconds, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 35.18
ECP-CANDLE 0.4 - Benchmark: P3B1 (Seconds, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 1473.43
ECP-CANDLE 0.4 - Benchmark: P3B2 (Seconds, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 716.51
Mlpack Benchmark - Benchmark: scikit_ica (Seconds, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 48.31 (SE +/- 0.07, N = 3)
Mlpack Benchmark - Benchmark: scikit_qda (Seconds, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 66.04 (SE +/- 0.17, N = 3)
Mlpack Benchmark - Benchmark: scikit_svm (Seconds, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 17.59 (SE +/- 0.04, N = 3)
Mlpack Benchmark - Benchmark: scikit_linearridgeregression (Seconds, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 2.09 (SE +/- 0.01, N = 3)
OpenCV 4.5.4 - Test: DNN - Deep Neural Network (ms, Fewer Is Better) - NVIDIA GeForce RTX 3060 Laptop GPU: 28019 (SE +/- 568.38, N = 15)
OpenCV 4.5.4 (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
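The results above report a standard error (SE) next to each mean. One way to compare run-to-run noise across tests with very different scales is to express SE as a percentage of the mean. The sketch below does that for a handful of values copied from this result file. The function name relative_se_percent and the 5% cut-off are illustrative assumptions, not something the Phoronix Test Suite defines.

```python
# (label, mean, standard error) copied from the results above.
results = [
    ("NCNN squeezenet_ssd", 19.63, 4.25),
    ("NCNN yolov4-tiny", 19.01, 0.43),
    ("NCNN resnet18", 6.51, 0.21),
    ("TNN DenseNet", 2721.41, 2.86),
    ("OpenCV DNN", 28019.0, 568.38),
]


def relative_se_percent(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


# Assumption: 5% is an arbitrary cut-off chosen for this example only.
THRESHOLD_PERCENT = 5.0

# Labels of results whose relative SE exceeds the cut-off.
noisy = [
    label
    for label, mean, se in results
    if relative_se_percent(mean, se) > THRESHOLD_PERCENT
]

for label, mean, se in results:
    print(f"{label}: relative SE {relative_se_percent(mean, se):.2f}%")
print("Above threshold:", noisy)
```

With these inputs only the NCNN squeezenet_ssd result exceeds the assumed cut-off, which matches its wide MIN/MAX spread (13.92 to 38.56 ms).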
Phoronix Test Suite v10.8.5