Benchmarks for a future article on Phoronix looking at HBv4 Genoa-X Linux performance.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2308011-PTS-AZUREHBV71
Microsoft Azure HBv4 HPC Performance Benchmarks
"Component",,"HC","HBv2","HBv3","HBv4"
Processor,,2 x Intel Xeon Platinum 8168 (44 Cores),2 x AMD EPYC 7V12 64-Core (120 Cores),2 x AMD EPYC 7V73X 64-Core (120 Cores),2 x AMD EPYC 9V33X 96-Core (176 Cores)
Motherboard,,Microsoft Virtual Machine (Hyper-V UEFI v4.1 BIOS),Microsoft Virtual Machine (Hyper-V UEFI v4.1 BIOS),Microsoft Virtual Machine (Hyper-V UEFI v4.1 BIOS),Microsoft Virtual Machine (Hyper-V UEFI v4.1 BIOS)
Memory,,1 GB + 60928 MB + 118272 MB + 176 GB,1 GB + 59 GB + 54 GB + 114 GB + 114 GB + 114 GB,1 GB + 59 GB + 54 GB + 114 GB + 114 GB + 114 GB,1 GB + 59 GB + 116 GB + 176 GB + 176 GB + 176 GB
Disk,,32GB Virtual Disk + 752GB Virtual Disk,960GB Microsoft NVMe Direct Disk + 32GB Virtual Disk + 515GB Virtual Disk,2 x 960GB Microsoft NVMe Direct Disk + 32GB Virtual Disk + 515GB Virtual Disk,2 x 1920GB Microsoft NVMe Direct Disk + 32GB Virtual Disk + 515GB Virtual Disk
Graphics,,hyperv_fb,hyperv_fb,hyperv_fb,hyperv_fb
OS,,AlmaLinux 8.8,AlmaLinux 8.8,AlmaLinux 8.8,AlmaLinux 8.8
Kernel,,4.18.0-425.3.1.el8.x86_64 (x86_64),4.18.0-425.3.1.el8.x86_64 (x86_64),4.18.0-425.3.1.el8.x86_64 (x86_64),4.18.0-425.3.1.el8.x86_64 (x86_64)
Compiler,,GCC 13.1.0 + CUDA 12.1,GCC 13.1.0 + CUDA 12.1,GCC 13.1.0 + CUDA 12.1,GCC 13.1.0 + CUDA 12.1
File-System,,nfs,nfs,nfs,nfs
Screen Resolution,,1024x768,1024x768,1024x768,1024x768
System Layer,,microsoft,microsoft,microsoft,microsoft
"Test","Scale (HIB = higher is better, LIB = lower is better)","HC","HBv2","HBv3","HBv4"
"High Performance Conjugate Gradient - X Y Z: 104 104 104 - RT: 60 (GFLOP/s)",HIB,25.9971,37.0410,39.6093,89.3840
"High Performance Conjugate Gradient - X Y Z: 144 144 144 - RT: 60 (GFLOP/s)",HIB,25.8659,36.0866,38.9739,88.5160
"High Performance Conjugate Gradient - X Y Z: 160 160 160 - RT: 60 (GFLOP/s)",HIB,25.5635,36.0167,39.1106,87.9013
"NAS Parallel Benchmarks - Test / Class: BT.C (Mop/s)",HIB,106230.52,241509.88,313813.98,744413.90
"NAS Parallel Benchmarks - Test / Class: CG.C (Mop/s)",HIB,27619.05,36367.35,36681.43,74101.94
"NAS Parallel Benchmarks - Test / Class: FT.C (Mop/s)",HIB,55288.19,98485.23,102122.36,230164.79
"NAS Parallel Benchmarks - Test / Class: IS.D (Mop/s)",HIB,1864.68,3977.02,5730.01,12967.37
"NAS Parallel Benchmarks - Test / Class: MG.C (Mop/s)",HIB,63404.01,108985.72,131635.41,437417.16
"NAS Parallel Benchmarks - Test / Class: SP.C (Mop/s)",HIB,41543.94,104771.90,205795.59,427298.99
"NAMD - ATPase Simulation - 327,506 Atoms (days/ns)",LIB,0.52697,0.26505,0.27111,0.14380
"libxsmm - M N K: 128 (GFLOPS/s)",HIB,1284.8,1011.4,2273.5,6655.2
"libxsmm - M N K: 256 (GFLOPS/s)",HIB,904.1,1128.3,2045.7,6908.6
"libxsmm - M N K: 32 (GFLOPS/s)",HIB,384.9,164.8,1438.1,6163.0
"libxsmm - M N K: 64 (GFLOPS/s)",HIB,748.1,331.4,2413.7,5898.2
"Laghos - Test: Triple Point Problem (Major Kernels Rate)",HIB,156.52,183.82,192.74,228.15
"Laghos - Test: Sedov Blast Wave, cube_922_hex.mesh (Major Kernels Rate)",HIB,247.49,345.14,361.81,402.94
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s)",HIB,58.3567,91.5383,103.5147,256.349
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s)",HIB,62.9750,95.8801,135.694,355.855
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s)",HIB,114.025,191.775,254.252,622.580
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s)",HIB,33.5193,47.6050,57.3307,159.175
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 (GFLOP/s)",HIB,59.7292,91.2601,103.409,244.342
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 (GFLOP/s)",HIB,57.7643,93.7923,123.242,323.356
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s)",HIB,57.3101,91.9186,103.2457,261.903
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s)",HIB,60.8804,91.4802,121.283,314.336
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 (GFLOP/s)",HIB,110.049,190.949,232.166,596.226
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 (GFLOP/s)",HIB,31.5718,46.9794,56.2161,154.648
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 (GFLOP/s)",HIB,60.5727,93.3137,102.7046,264.954
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 (GFLOP/s)",HIB,59.8216,94.5301,117.731,311.803
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256 (GFLOP/s)",HIB,58.5498,90.7883,105.093,255.968
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512 (GFLOP/s)",HIB,62.9027,96.4941,135.950,355.512
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256 (GFLOP/s)",HIB,122.772,200.035,221.861,427.101
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512 (GFLOP/s)",HIB,113.940,191.141,257.419,624.951
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512 (GFLOP/s)",HIB,33.5545,47.3696,57.2263,159.258
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 (GFLOP/s)",HIB,59.5527,92.1290,105.361,247.725
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512 (GFLOP/s)",HIB,57.9203,93.2573,124.595,323.696
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 (GFLOP/s)",HIB,57.1290,88.6081,106.632,273.121
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 (GFLOP/s)",HIB,60.8204,91.4296,120.957,315.982
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512 (GFLOP/s)",HIB,110.197,189.208,233.797,590.925
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512 (GFLOP/s)",HIB,31.5846,46.9289,56.2690,154.568
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 (GFLOP/s)",HIB,60.8872,92.3883,105.5003,258.716
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512 (GFLOP/s)",HIB,59.8954,95.1989,118.236,311.267
"Pennant - Test: sedovbig (Hydro Cycle Time - sec)",LIB,25.01956,5.915805,6.277107,3.581391
"Pennant - Test: leblancbig (Hydro Cycle Time - sec)",LIB,10.64548,3.466885,3.649317,2.122074
"ACES DGEMM - Sustained Floating-Point Rate (GFLOP/s)",HIB,14.072027,6.395415,25.048352,52.802440
"Intel Open Image Denoise - Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec)",HIB,1.85,2.03,1.72,3.11
"Intel Open Image Denoise - Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec)",HIB,1.85,2.01,1.69,3.08
"Intel Open Image Denoise - Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only (Images / Sec)",HIB,0.87,0.96,0.80,1.32
"OSPRay - Benchmark: particle_volume/ao/real_time (Items/sec)",HIB,8.99618,22.3668,24.4710,36.6548
"OSPRay - Benchmark: particle_volume/scivis/real_time (Items/sec)",HIB,8.87831,22.1747,24.2197,36.5446
"OSPRay - Benchmark: particle_volume/pathtracer/real_time (Items/sec)",HIB,96.7630,162.449,167.504,208.050
"OSPRay - Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items/sec)",HIB,9.52293,8.66888,11.7501,38.0769
"OSPRay - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items/sec)",HIB,9.02689,8.32323,11.1723,37.0624
"OSPRay - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items/sec)",HIB,10.0611,13.9416,14.6088,32.5839
"7-Zip Compression - Test: Compression Rating (MIPS)",HIB,216451,501534,566595,1083523
"7-Zip Compression - Test: Decompression Rating (MIPS)",HIB,150841,388577,406516,742859
"Timed Node.js Compilation - Time To Compile (sec)",LIB,330.613,194.367,185.567,150.558
"oneDNN - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms)",LIB,707.322,1367.73,886.810,533.494
"oneDNN - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms)",LIB,442.471,910.937,529.973,411.234
"Liquid-DSP - Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s)",HIB,1570633333,4309133333,4216966667,5412900000
"Liquid-DSP - Threads: 176 - Buffer Length: 256 - Filter Length: 32 (samples/s)",HIB,1536633333,4275533333,3864000000,6181766667
"Liquid-DSP - Threads: 176 - Buffer Length: 256 - Filter Length: 57 (samples/s)",HIB,1683033333,4350100000,4281533333,7095033333
"Liquid-DSP - Threads: 176 - Buffer Length: 256 - Filter Length: 512 (samples/s)",HIB,544626667,924243333,814950000,2221966667
"PostgreSQL - Scaling Factor: 1 - Clients: 500 - Mode: Read Only (TPS)",HIB,1353510,2467328,2434749,3161848
"PostgreSQL - Scaling Factor: 1 - Clients: 500 - Mode: Read Only - Average Latency (ms)",LIB,0.369,0.203,0.206,0.158
"PostgreSQL - Scaling Factor: 1 - Clients: 800 - Mode: Read Only (TPS)",HIB,1159492,2481320,2478917,3146173
"PostgreSQL - Scaling Factor: 1 - Clients: 800 - Mode: Read Only - Average Latency (ms)",LIB,0.690,0.323,0.323,0.254
"Blender - Blend File: BMW27 - Compute: CPU-Only (sec)",LIB,49.95,19.58,19.43,10.11
"Blender - Blend File: Classroom - Compute: CPU-Only (sec)",LIB,138.51,50.95,50.71,25.61
"Blender - Blend File: Fishy Cat - Compute: CPU-Only (sec)",LIB,71.76,26.43,25.59,13.74
"Blender - Blend File: Barbershop - Compute: CPU-Only (sec)",LIB,526.93,211.46,188.96,97.52
"Blender - Blend File: Pabellon Barcelona - Compute: CPU-Only (sec)",LIB,175.07,64.84,62.90,33.01
"PETSc - Test: Streams (MB/s)",HIB,151286.2491,197895.4717,284001.9162,598417.6957
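
The result rows above mix higher-is-better and lower-is-better metrics, so raw ratios between columns are not directly comparable. The Python sketch below, which is not part of the Phoronix Test Suite, normalizes each row to a speedup relative to a baseline system. It assumes that HIB and LIB mark higher-is-better and lower-is-better results and that each row uses the six-column layout shown above. The names `speedups`, `SAMPLE` and `SYSTEMS` are hypothetical and introduced only for illustration.

```python
import csv
import io

# Three rows copied from the results table above.
# Columns: test name, HIB/LIB flag, then HC, HBv2, HBv3, HBv4.
SAMPLE = '''"High Performance Conjugate Gradient - X Y Z: 104 104 104 - RT: 60 (GFLOP/s)",HIB,25.9971,37.0410,39.6093,89.3840
"NAMD - ATPase Simulation - 327,506 Atoms (days/ns)",LIB,0.52697,0.26505,0.27111,0.14380
"PETSc - Test: Streams (MB/s)",HIB,151286.2491,197895.4717,284001.9162,598417.6957
'''

SYSTEMS = ["HC", "HBv2", "HBv3", "HBv4"]


def speedups(csv_text, baseline="HC"):
    """Return {test name: {system: speedup relative to baseline}}.

    A value above 1.0 always means "faster than the baseline",
    whether the underlying metric is higher- or lower-is-better.
    """
    results = {}
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip header, hardware-description, and malformed rows.
        if len(row) != 2 + len(SYSTEMS) or row[1] not in ("HIB", "LIB"):
            continue
        name, direction = row[0], row[1]
        values = dict(zip(SYSTEMS, (float(v) for v in row[2:])))
        base = values[baseline]
        if direction == "HIB":
            # Higher is better: divide each result by the baseline.
            results[name] = {s: v / base for s, v in values.items()}
        else:
            # Lower is better: invert so that larger still means faster.
            results[name] = {s: base / v for s, v in values.items()}
    return results


if __name__ == "__main__":
    for test, per_system in speedups(SAMPLE).items():
        summary = ", ".join(f"{s}: {x:.2f}x" for s, x in per_system.items())
        print(f"{test}\n  {summary}")
```

To process the full export, the same function can be given the file's text in place of `SAMPLE`. Rows without an HIB or LIB flag, such as the hardware description table, are skipped.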