...
```
ubuntu@gputest:~$ lspci | grep NVIDIA
00:05.0 3D controller: NVIDIA Corporation Device 20f1 (rev a1)
```
You can also verify that a license for the GPU was acquired successfully (yes, we need licenses to use our GPUs...):
```
ubuntu@gputest:~$ journalctl -u nvidia-gridd | tail
....
Jan 05 07:44:57 gputest nvidia-gridd[1159694]: Acquiring license. (Info: http://openstack-nvidia.lisens.ntnu.no:7070/request; NVIDIA Virtual Compute Server)
Jan 05 07:44:57 gputest nvidia-gridd[1159694]: Calling load_byte_array(tra)
Jan 05 07:44:59 gputest nvidia-gridd[1159694]: License acquired successfully. (Info: http://openstack-nvidia.lisens.ntnu.no:7070/request; NVIDIA Virtual Compute Server)
```
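If you want to check the license state from a script instead of reading the journal, `nvidia-smi -q` also reports it. This is a minimal sketch, assuming the field names of a 470-series driver; the sample output is embedded so the snippet runs anywhere, and on the VM you would pipe the real command shown in the comment instead:

```shell
# On the VM, use the live output instead of this sample:
#   nvidia-smi -q | awk -F': ' '/License Status/ {print $2}'
# Sample of the relevant `nvidia-smi -q` section (field names assumed from a
# 470-series driver):
sample='vGPU Software Licensed Product
    Product Name              : NVIDIA Virtual Compute Server
    License Status            : Licensed'

# Extract the "License Status" field; the same awk works on live output.
status=$(printf '%s\n' "$sample" | awk -F': ' '/License Status/ {print $2}')
echo "License status: $status"
```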
The "nvidia-smi" tool will show you the GPU status:
```
ubuntu@gputest:~$ nvidia-smi
Thu Jan  5 07:46:10 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.82.01    Driver Version: 470.82.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID A100-4C        On   | 00000000:00:05.0 Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |    407MiB /  4091MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
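For ongoing monitoring, the full table is noisy; nvidia-smi's query mode gives machine-readable CSV instead. Below is a minimal parsing sketch with one sample line embedded (values taken from the status output above) so it runs without a GPU; on the VM you would feed it the real command shown in the comment:

```shell
# On the VM, use the live output instead of the sample line:
#   nvidia-smi --query-gpu=name,utilization.gpu,memory.used,memory.total --format=csv,noheader
# Sample line so the sketch runs anywhere:
line='GRID A100-4C, 0 %, 407 MiB, 4091 MiB'

# Split the CSV fields and print a one-line summary.
IFS=',' read -r name util used total <<EOF
$line
EOF
echo "GPU $name: util$util, memory$used of$total"
```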
...
```
ubuntu@gputest:~$ sudo su -
root@gputest:~# cd NVIDIA_CUDA-11.4_Samples/1_Utilities/deviceQuery
root@gputest:~/NVIDIA_CUDA-11.4_Samples/1_Utilities/deviceQuery# make
... lots-of-text-from-make ...
root@gputest:~/NVIDIA_CUDA-11.4_Samples/1_Utilities/deviceQuery# ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GRID A100-4C"
  CUDA Driver Version / Runtime Version          11.4 / 11.4
  CUDA Capability Major/Minor version number:    8.0
  Total amount of global memory:                 4092 MBytes (4290641920 bytes)
  (108) Multiprocessors, ( 64) CUDA Cores/MP:    6912 CUDA Cores
  GPU Max Clock rate:                            1410 MHz (1.41 GHz)
  Memory Clock rate:                             1215 Mhz
  Memory Bus Width:                              5120-bit
  L2 Cache Size:                                 41943040 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total amount of shared memory per multiprocessor: 167936 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                No
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 5
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.4, CUDA Runtime Version = 11.4, NumDevs = 1
Result = PASS
```
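Beyond the bundled deviceQuery sample, a quick way to confirm that the toolkit and driver agree is to compile a minimal CUDA program yourself. This is only a sketch, not part of the samples: the file name is arbitrary, and the compile step is guarded so the snippet does something sensible on machines without the toolkit:

```shell
# Write a minimal CUDA program (hypothetical file name) that just counts devices.
cat > /tmp/devcount.cu <<'EOF'
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaError_t err = cudaGetDeviceCount(&n);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("CUDA devices: %d\n", n);
    return 0;
}
EOF

# Compile and run only if nvcc is available (i.e. on the GPU VM).
if command -v nvcc >/dev/null 2>&1; then
    nvcc /tmp/devcount.cu -o /tmp/devcount && /tmp/devcount
else
    echo "nvcc not found; run this step on the GPU instance"
fi
```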
...
```
# Enable the repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
  && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
  && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Install the package
sudo apt update && sudo apt -y install nvidia-docker2

# Restart the docker daemon
sudo systemctl restart docker

# Run a test to verify that it works
sudo docker run --rm --gpus all nvidia/cuda:11.4.0-base nvidia-smi

# Optionally run a test with Tensorflow that actually runs a bit of code on the GPU via docker
sudo docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu \
  python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
```
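If the docker test fails, the first thing to check is that the package registered an "nvidia" runtime in /etc/docker/daemon.json. The sketch below writes a sample copy of what nvidia-docker2 typically installs (contents assumed, and they can differ between package versions) and greps it; on the VM you would grep the real file instead:

```shell
# Sample of what nvidia-docker2 typically puts in /etc/docker/daemon.json
# (contents assumed; on the VM, check the real file rather than this sample).
cat > /tmp/daemon.json <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF

# Verify the nvidia runtime is registered.
grep -q '"nvidia"' /tmp/daemon.json && echo "nvidia runtime registered"
```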
...