
How do I install SHOC for GPUs?
INSTALL SHOC 1.1.5
More on installing OpenMPI and CUDA.
$ module add openmpi/v4.0.3
$ module add cuda/v10.1
$ git clone https://github.com/vetter/shoc.git
$ cd shoc
$ cat config/conf-test.sh
#!/bin/sh
sh ./configure \
CPPFLAGS="-I/nfs/software/cuda/v10.1/include" \
CUDA_CPPFLAGS="-gencode=arch=compute_70,code=sm_70"
$ sh ./config/conf-test.sh
$ make
$ make install
$ perl tools/driver.pl -cuda -s 4 -d 0
--- Welcome To The SHOC Benchmark Suite version 1.1.5 ---
Hostname: hostname
Platform selection not specified, default to platform #0
Number of available platforms: 1
Number of available devices on platform 0 : 4
Device 0: 'Tesla V100-SXM2-32GB'
Device 1: 'Tesla V100-SXM2-32GB'
Device 2: 'Tesla V100-SXM2-32GB'
Device 3: 'Tesla V100-SXM2-32GB'
Specified 4 device IDs: 0
Using size class: 4
--- Starting Benchmarks ---
Running benchmark BusSpeedDownload
result for bspeed_download: 12.3180 GB/sec
Running benchmark BusSpeedReadback
result for bspeed_readback: 13.1676 GB/sec
Running benchmark MaxFlops
result for maxspflops: 15548.4000 GFLOPS
result for maxdpflops: 7837.9000 GFLOPS
Running benchmark DeviceMemory
result for gmem_readbw: 790.3370 GB/s
result for gmem_readbw_strided: 469.8600 GB/s
result for gmem_writebw: 726.6530 GB/s
result for gmem_writebw_strided: 53.4400 GB/s
result for lmem_readbw: 9527.5400 GB/s
result for lmem_writebw: 10578.7000 GB/s
result for tex_readbw: 1580.6200 GB/sec
Skipping non-cuda benchmark KernelCompile
Skipping non-cuda benchmark QueueDelay
Running benchmark FFT
result for fft_sp: 2299.0700 GFLOPS
result for fft_dp: 1146.0300 GFLOPS
Running benchmark GEMM
result for sgemm_n: 14587.1000 GFlops
result for dgemm_n: 6432.5500 GFlops
Running benchmark MD
result for md_sp_flops: 938.3610 GFLOPS
result for md_dp_flops: 846.9970 GFLOPS
Running benchmark MD5Hash
result for md5hash: 34.5492 GHash/s
Running benchmark Reduction
result for reduction: 303.6210 GB/s
result for reduction_dp: 513.4440 GB/s
Running benchmark Scan
result for scan: 174.1060 GB/s
result for scan_dp: 175.7370 GB/s
Running benchmark Sort
result for sort: 20.0892 GB/s
Running benchmark Spmv
result for spmv_csr_scalar_sp: 63.3591 Gflop/s
result for spmv_csr_vector_sp: 151.2000 Gflop/s
result for spmv_ellpackr_sp: 80.3836 Gflop/s
result for spmv_csr_scalar_dp: 45.5877 Gflop/s
result for spmv_csr_vector_dp: 111.4730 Gflop/s
result for spmv_ellpackr_dp: 65.9179 Gflop/s
Running benchmark Stencil2D
result for stencil: 643.0910 GFLOPS
result for stencil_dp: 522.7100 GFLOPS
Running benchmark Triad
result for triad_bw: 16.3517 GB/s
Running benchmark S3D
result for s3d: 428.6150 GFLOPS
result for s3d_dp: 225.6930 GFLOPS
Running benchmark QTC
result for qtc: 5.5839 s
result for qtc_kernel: 4.8583 s
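The benchmark output above is easier to compare between runs once it is reduced to a table. The following is a minimal sketch, not part of SHOC: shoc_results_csv is a hypothetical helper name, and it assumes every result line has the form "result for <name>: <value> <unit>", as in the output shown here.

```shell
# Hypothetical helper (not shipped with SHOC): read driver output on stdin
# and print one "name,value,unit" CSV row per "result for ..." line.
shoc_results_csv() {
    awk '/^result for / {
        name = $3
        sub(/:$/, "", name)        # drop the trailing colon from the name
        print name "," $4 "," $5   # e.g. bspeed_download,12.3180,GB/sec
    }'
}
```

One possible way to use it, assuming the function has been defined in the current shell:

$ perl tools/driver.pl -cuda -s 4 -d 0 | shoc_results_csv > results.csv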
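Notes on the installation commands at the top of this page (the numbers count the lines of that listing, including the four lines of conf-test.sh):
1-2. Load the required modules (module add openmpi/v4.0.3, module add cuda/v10.1).
3-4. Download SHOC (git clone) and change into its directory (cd shoc).
5. Create the configuration file config/conf-test.sh, which specifies the Compute Capability of the GPU (here 7.0 for the Tesla V100, i.e. compute_70 and sm_70).
10-12. Configure, build, and install SHOC (sh ./config/conf-test.sh, make, make install).
13. Run SHOC (perl tools/driver.pl -cuda -s 4 -d 0).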
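The CUDA_CPPFLAGS value in conf-test.sh is specific to the Tesla V100 (Compute Capability 7.0). For a different GPU the same flag can be derived from that GPU's Compute Capability. The following is a minimal sketch: gencode_flag is a hypothetical helper name, and it assumes the capability is given in "major.minor" form and that the installed CUDA toolkit supports that architecture.

```shell
# Hypothetical helper: build the nvcc -gencode flag from a Compute Capability
# written as "major.minor" (for example 7.0 for the Tesla V100).
gencode_flag() {
    cc=$(printf '%s' "$1" | tr -d '.')   # "7.0" -> "70"
    printf '%s\n' "-gencode=arch=compute_${cc},code=sm_${cc}"
}
```

For example, gencode_flag 7.0 prints the value used in conf-test.sh, -gencode=arch=compute_70,code=sm_70.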
