
Ongoing incident

Incident description

The Nvidia GRID license server (nvidiadls02.it.ntnu.no), which serves vGPU licenses to GPU-enabled VMs on all of NTNU's OpenStack platforms, has been reinstalled without us being notified. The cause is missing documentation on NTNU IT's side: because the server's role was not documented, the engineer believed it was not in use.

Impact

New GPU VMs cannot retrieve a license, so their vGPU will not work. Running VMs will lose their license over time, and will also lose it upon a reboot.


Event log

Time               Event
15.03.24           The server was reinstalled by NTNU IT.
19.03.24 - 13:49   We discovered that new GPU VMs were no longer able to acquire a license. A few minutes later it became obvious that the server had been reinstalled.
19.03.24 - 14:06   The engineer in question was contacted, as he was involved in setting this up in June last year. He confirmed that he had reinstalled the server.

