test_002
AMD Ryzen 9 7900X3D 12-Core testing with an ASUS ProArt B650-CREATOR (2007 BIOS) and ASUS NVIDIA GeForce RTX 3070 8GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2410181-NE-TEST0022402&grs .
System details (test_002)
Processor: AMD Ryzen 9 7900X3D 12-Core @ 5.66GHz (12 Cores / 24 Threads)
Motherboard: ASUS ProArt B650-CREATOR (2007 BIOS)
Chipset: AMD Device 14d8
Memory: 2 x 32 GB DDR5-4800MT/s Kingston KF556C36-32
Disk: 2000GB Samsung SSD 990 PRO 2TB
Graphics: ASUS NVIDIA GeForce RTX 3070 8GB
Audio: NVIDIA GA104 HD Audio
Monitor: HP E242
Network: Realtek RTL8111/8168/8211/8411 + Realtek RTL8125 2.5GbE
OS: Ubuntu 24.04
Kernel: 6.8.0-47-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server 1.21.1.11
Display Driver: NVIDIA 560.35.03
Compiler: GCC 13.2.0 + CUDA 12.6
File-System: ext4
Screen Resolution: 1920x1200

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: amd-pstate-epp performance (EPP: performance)
- CPU Microcode: 0xa601206
- GLAMOR
- BAR1 / Visible vRAM Size: 8192 MiB
- vBIOS Version: 94.04.3a.40.2d
- GPU Compute Cores: 5888
- Python 3.12.3
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Results overview
Test | test_002
neatbench: GPU | 3070
indigobench: OpenCL GPU - Supercar | 36.849
indigobench: OpenCL GPU - Bedroom | 13.175
blender: Pabellon Barcelona - NVIDIA OptiX | 29.06
blender: Barbershop - NVIDIA OptiX | 89.69
blender: Fishy Cat - NVIDIA OptiX | 18.01
blender: Classroom - NVIDIA OptiX | 26.28
blender: Junkshop - NVIDIA OptiX | 17.86
blender: BMW27 - NVIDIA OptiX | 10.44
ncnn: Vulkan GPU - vision_transformer | 54.22
ncnn: Vulkan GPU - regnety_400m | 11.48
ncnn: Vulkan GPU - yolov4-tiny | 18.60
ncnn: Vulkan GPU - mobilenetv2-yolov3 | 11.97
ncnn: Vulkan GPU - resnet50 | 15.15
ncnn: Vulkan GPU - googlenet | 11.72
ncnn: Vulkan GPU - shufflenet-v2 | 4.84
ncnn: Vulkan GPU - mobilenet | 11.97
gromacs: NVIDIA CUDA GPU - water_GMX50_bare | 15.112
viennacl: CPU BLAS - dGEMM-TT | 60.7
viennacl: CPU BLAS - dGEMM-TN | 63.4
viennacl: CPU BLAS - dGEMM-NT | 56.9
viennacl: CPU BLAS - dGEMM-NN | 59.5
viennacl: CPU BLAS - dAXPY | 125
viennacl: CPU BLAS - dCOPY | 81.6
luxcorerender: Rainbow Colors and Prism - GPU | 23.13
luxcorerender: LuxCore Benchmark - GPU | 7.47
luxcorerender: Orange Juice - GPU | 8.40
luxcorerender: Danish Mood - GPU | 5.89
luxcorerender: DLSC - GPU | 8.48
fahbench: | 269.6185
octanebench: Total Score | 399.094682
betsy: ETC2 RGB - Highest | 165.422
betsy: ETC1 - Highest | 166.577
namd-cuda: ATPase Simulation - 327,506 Atoms | 0.09577
mixbench: NVIDIA CUDA - Single Precision | 21228.42
mixbench: NVIDIA CUDA - Double Precision | 301.61
mixbench: NVIDIA CUDA - Half Precision | 22007.25
mixbench: NVIDIA CUDA - Integer | 9956.87
hashcat: TrueCrypt RIPEMD160 + XTS | 483967
hashcat: SHA-512 | 1869466667
hashcat: 7-Zip | 655300
hashcat: SHA1 | 12576133333
hashcat: MD5 | 41067200000
waifu2x-ncnn: 2x - 3 - Yes | 4.661
realsr-ncnn: 4x - Yes | 48.781
realsr-ncnn: 4x - No | 8.993
ncnn: Vulkan GPU - FastestDet | 5.50
ncnn: Vulkan GPU - squeezenet_ssd | 9.80
ncnn: Vulkan GPU - alexnet | 5.74
ncnn: Vulkan GPU - resnet18 | 7.46
ncnn: Vulkan GPU - vgg16 | 35.81
ncnn: Vulkan GPU - blazeface | 2.00
ncnn: Vulkan GPU - efficientnet-b0 | 5.97
ncnn: Vulkan GPU - mnasnet | 4.27
ncnn: Vulkan GPU - mobilenet-v3 | 4.59
ncnn: Vulkan GPU - mobilenet-v2 | 4.58
viennacl: CPU BLAS - dGEMV-T | 154
viennacl: CPU BLAS - dGEMV-N | 132
viennacl: CPU BLAS - dDOT | 119.1
viennacl: CPU BLAS - sDOT | 245
viennacl: CPU BLAS - sAXPY | 381
viennacl: CPU BLAS - sCOPY | 261
vkpeak: |
NeatBench 5, Acceleration: GPU. FPS, More Is Better. test_002: 3070 (SE +/- 0.00, N = 3).
IndigoBench 4.4, Acceleration: OpenCL GPU - Scene: Supercar. M samples/s, More Is Better. test_002: 36.85 (SE +/- 0.05, N = 3).
IndigoBench 4.4, Acceleration: OpenCL GPU - Scene: Bedroom. M samples/s, More Is Better. test_002: 13.18 (SE +/- 0.00, N = 3).
Blender 4.2, Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. test_002: 29.06 (SE +/- 0.01, N = 3).
Blender 4.2, Blend File: Barbershop - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. test_002: 89.69 (SE +/- 0.05, N = 3).
Blender 4.2, Blend File: Fishy Cat - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. test_002: 18.01 (SE +/- 0.07, N = 3).
Blender 4.2, Blend File: Classroom - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. test_002: 26.28 (SE +/- 0.04, N = 3).
Blender 4.2, Blend File: Junkshop - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. test_002: 17.86 (SE +/- 0.17, N = 6).
Blender 4.2, Blend File: BMW27 - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. test_002: 10.44 (SE +/- 0.07, N = 13).
NCNN 20230517, Target: Vulkan GPU - Model: vision_transformer. ms, Fewer Is Better. test_002: 54.22 (SE +/- 0.28, N = 15; MIN: 40.42 / MAX: 312.69). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: regnety_400m. ms, Fewer Is Better. test_002: 11.48 (SE +/- 0.16, N = 15; MIN: 7.77 / MAX: 272.99). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: yolov4-tiny. ms, Fewer Is Better. test_002: 18.60 (SE +/- 0.19, N = 15; MIN: 13.38 / MAX: 316.58). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: mobilenetv2-yolov3. ms, Fewer Is Better. test_002: 11.97 (SE +/- 0.13, N = 15; MIN: 8.76 / MAX: 160.83). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: resnet50. ms, Fewer Is Better. test_002: 15.15 (SE +/- 0.19, N = 15; MIN: 10.79 / MAX: 156.37). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: googlenet. ms, Fewer Is Better. test_002: 11.72 (SE +/- 0.17, N = 15; MIN: 8 / MAX: 244.48). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: shufflenet-v2. ms, Fewer Is Better. test_002: 4.84 (SE +/- 0.06, N = 15; MIN: 3.5 / MAX: 147.56). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: mobilenet. ms, Fewer Is Better. test_002: 11.97 (SE +/- 0.13, N = 15; MIN: 8.76 / MAX: 160.83). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
GROMACS 2024, Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare. Ns Per Day, More Is Better. test_002: 15.11 (SE +/- 0.02, N = 3). (CXX) g++ options: -O3 -lm
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TT. GFLOPs/s, More Is Better. test_002: 60.7 (SE +/- 0.27, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TN. GFLOPs/s, More Is Better. test_002: 63.4 (SE +/- 0.35, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NT. GFLOPs/s, More Is Better. test_002: 56.9 (SE +/- 0.13, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NN. GFLOPs/s, More Is Better. test_002: 59.5 (SE +/- 0.19, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
ViennaCL 1.7.1, Test: CPU BLAS - dAXPY. GB/s, More Is Better. test_002: 125 (SE +/- 1.26, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
ViennaCL 1.7.1, Test: CPU BLAS - dCOPY. GB/s, More Is Better. test_002: 81.6 (SE +/- 1.23, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
LuxCoreRender 2.6, Scene: Rainbow Colors and Prism - Acceleration: GPU. M samples/sec, More Is Better. test_002: 23.13 (SE +/- 0.02, N = 3; MIN: 21.09 / MAX: 24.5).
LuxCoreRender 2.6, Scene: LuxCore Benchmark - Acceleration: GPU. M samples/sec, More Is Better. test_002: 7.47 (SE +/- 0.02, N = 3; MIN: 2.84 / MAX: 8.69).
LuxCoreRender 2.6, Scene: Orange Juice - Acceleration: GPU. M samples/sec, More Is Better. test_002: 8.40 (SE +/- 0.05, N = 3; MIN: 6.8 / MAX: 10.5).
LuxCoreRender 2.6, Scene: Danish Mood - Acceleration: GPU. M samples/sec, More Is Better. test_002: 5.89 (SE +/- 0.05, N = 3; MIN: 2.03 / MAX: 7.03).
LuxCoreRender 2.6, Scene: DLSC - Acceleration: GPU. M samples/sec, More Is Better. test_002: 8.48 (SE +/- 0.00, N = 3; MIN: 8.35 / MAX: 8.66).
FAHBench 2.3.2. Ns Per Day, More Is Better. test_002: 269.62 (SE +/- 0.08, N = 3).
OctaneBench 2020.1, Total Score. Score, More Is Better. test_002: 399.09.
Betsy GPU Compressor 1.1 Beta, Codec: ETC2 RGB - Quality: Highest. Seconds, Fewer Is Better. test_002: 165.42 (SE +/- 0.04, N = 3).
Betsy GPU Compressor 1.1 Beta, Codec: ETC1 - Quality: Highest. Seconds, Fewer Is Better. test_002: 166.58 (SE +/- 1.14, N = 3).
NAMD CUDA 2.14, ATPase Simulation - 327,506 Atoms. days/ns, Fewer Is Better. test_002: 0.09577 (SE +/- 0.00092, N = 3).
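NAMD reports days/ns (fewer is better), while GROMACS and FAHBench above report Ns Per Day (more is better). The short Python sketch below inverts the NAMD figure so it reads in the same unit. The function name is hypothetical, and the converted number is only a unit change: the three tests simulate different systems, so their values are still not directly comparable.

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Invert a days/ns figure into ns/day (illustrative helper, not part of PTS)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# NAMD CUDA 2.14 result for test_002, taken from the report above.
namd_days_per_ns = 0.09577
namd_ns_per_day = days_per_ns_to_ns_per_day(namd_days_per_ns)
print(f"NAMD ATPase: {namd_ns_per_day:.2f} ns/day")
```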
Mixbench 2020-06-23, Backend: NVIDIA CUDA - Benchmark: Single Precision. GFLOPS, More Is Better. test_002: 21228.42 (SE +/- 125.04, N = 3). (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
Mixbench 2020-06-23, Backend: NVIDIA CUDA - Benchmark: Double Precision. GFLOPS, More Is Better. test_002: 301.61 (SE +/- 1.01, N = 3). (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
Mixbench 2020-06-23, Backend: NVIDIA CUDA - Benchmark: Half Precision. GFLOPS, More Is Better. test_002: 22007.25 (SE +/- 4.65, N = 3). (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
Mixbench 2020-06-23, Backend: NVIDIA CUDA - Benchmark: Integer. GIOPS, More Is Better. test_002: 9956.87 (SE +/- 7.01, N = 3). (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
Hashcat 6.2.4, Benchmark: TrueCrypt RIPEMD160 + XTS. H/s, More Is Better. test_002: 483967 (SE +/- 617.34, N = 3).
Hashcat 6.2.4, Benchmark: SHA-512. H/s, More Is Better. test_002: 1869466667 (SE +/- 1041366.62, N = 3).
Hashcat 6.2.4, Benchmark: 7-Zip. H/s, More Is Better. test_002: 655300 (SE +/- 1307.67, N = 3).
Hashcat 6.2.4, Benchmark: SHA1. H/s, More Is Better. test_002: 12576133333 (SE +/- 138456013.87, N = 3).
Hashcat 6.2.4, Benchmark: MD5. H/s, More Is Better. test_002: 41067200000 (SE +/- 44800334.82, N = 3).
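The Hashcat results are raw H/s values spanning five orders of magnitude, which makes them hard to read at a glance. As a minimal sketch, the Python helper below scales them to kH/s, MH/s or GH/s using decimal (SI) prefixes. The function name and the choice of two decimal places are assumptions for illustration; the input values are the test_002 results listed above.

```python
def format_hash_rate(hashes_per_second: float) -> str:
    """Scale a raw H/s value to a readable SI-prefixed string (illustrative helper)."""
    units = ["H/s", "kH/s", "MH/s", "GH/s", "TH/s"]
    value = float(hashes_per_second)
    index = 0
    # Divide by 1000 until the value fits the current unit or units run out.
    while value >= 1000 and index < len(units) - 1:
        value /= 1000
        index += 1
    return f"{value:.2f} {units[index]}"


# Hashcat 6.2.4 results for test_002, in H/s, copied from the report above.
hashcat_results = {
    "TrueCrypt RIPEMD160 + XTS": 483967,
    "SHA-512": 1869466667,
    "7-Zip": 655300,
    "SHA1": 12576133333,
    "MD5": 41067200000,
}

for benchmark, rate in hashcat_results.items():
    print(f"{benchmark}: {format_hash_rate(rate)}")
```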
Waifu2x-NCNN Vulkan 20200818, Scale: 2x - Denoise: 3 - TAA: Yes. Seconds, Fewer Is Better. test_002: 4.661 (SE +/- 0.004, N = 3).
RealSR-NCNN 20200818, Scale: 4x - TAA: Yes. Seconds, Fewer Is Better. test_002: 48.78 (SE +/- 0.09, N = 3).
RealSR-NCNN 20200818, Scale: 4x - TAA: No. Seconds, Fewer Is Better. test_002: 8.993 (SE +/- 0.007, N = 3).
NCNN 20230517, Target: Vulkan GPU - Model: FastestDet. ms, Fewer Is Better. test_002: 5.50 (SE +/- 0.15, N = 15; MIN: 3.02 / MAX: 153.58). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: squeezenet_ssd. ms, Fewer Is Better. test_002: 9.80 (SE +/- 0.16, N = 15; MIN: 6.81 / MAX: 182.83). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: alexnet. ms, Fewer Is Better. test_002: 5.74 (SE +/- 0.18, N = 15; MIN: 3.82 / MAX: 213.26). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: resnet18. ms, Fewer Is Better. test_002: 7.46 (SE +/- 0.20, N = 15; MIN: 5.16 / MAX: 294.66). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: vgg16. ms, Fewer Is Better. test_002: 35.81 (SE +/- 0.59, N = 15; MIN: 25.17 / MAX: 344.71). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: blazeface. ms, Fewer Is Better. test_002: 2.00 (SE +/- 0.10, N = 15; MIN: 1.04 / MAX: 76.12). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: efficientnet-b0. ms, Fewer Is Better. test_002: 5.97 (SE +/- 0.24, N = 15; MIN: 4.13 / MAX: 180.96). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: mnasnet. ms, Fewer Is Better. test_002: 4.27 (SE +/- 0.07, N = 15; MIN: 3.16 / MAX: 133.29). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: mobilenet-v3. ms, Fewer Is Better. test_002: 4.59 (SE +/- 0.16, N = 15; MIN: 3.43 / MAX: 129.86). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: mobilenet-v2. ms, Fewer Is Better. test_002: 4.58 (SE +/- 0.11, N = 15; MIN: 3.39 / MAX: 181.44). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-T. GB/s, More Is Better. test_002: 154 (SE +/- 3.34, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-N. GB/s, More Is Better. test_002: 132 (SE +/- 2.64, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
ViennaCL 1.7.1, Test: CPU BLAS - dDOT. GB/s, More Is Better. test_002: 119.1 (SE +/- 2.04, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
ViennaCL 1.7.1, Test: CPU BLAS - sDOT. GB/s, More Is Better. test_002: 245 (SE +/- 4.92, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
ViennaCL 1.7.1, Test: CPU BLAS - sAXPY. GB/s, More Is Better. test_002: 381 (SE +/- 7.10, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
ViennaCL 1.7.1, Test: CPU BLAS - sCOPY. GB/s, More Is Better. test_002: 261 (SE +/- 5.00, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic
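Each result carries an "SE +/-" figure, but the absolute values are hard to compare across tests with different units and magnitudes. One way to judge run-to-run noise is to express the SE as a percentage of the reported result. The Python sketch below does this for the six ViennaCL results directly above. It assumes "SE" is the standard error of the reported mean; the helper name is hypothetical.

```python
def relative_se_percent(mean: float, standard_error: float) -> float:
    """Return the standard error as a percentage of the mean (illustrative helper)."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return standard_error / abs(mean) * 100.0


# (result in GB/s, SE) pairs for test_002, copied from the ViennaCL lines above.
viennacl_results = {
    "dGEMV-T": (154, 3.34),
    "dGEMV-N": (132, 2.64),
    "dDOT": (119.1, 2.04),
    "sDOT": (245, 4.92),
    "sAXPY": (381, 7.10),
    "sCOPY": (261, 5.00),
}

for test_name, (mean, se) in viennacl_results.items():
    print(f"{test_name}: {relative_se_percent(mean, se):.2f}% relative SE")
```

By this measure all six tests sit at roughly 2% relative SE, so differences of a few GB/s between runs of these tests would be within the noise.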
Phoronix Test Suite v10.8.5