...

Fixing GPU node after unscheduled downtime

Info: This procedure is automated!

This procedure can now be performed with our script "fix-gpu-mdevs.sh" in our tools repo.

This is "disaster recovery" and should only be necessary after a power loss, or when a sysadmin forgot the proper way to reboot a GPU node (which is to shelve all VMs before the reboot).
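For reference, the "shelve all VMs before reboot" step might look roughly like the sketch below. It assumes the cloud is OpenStack, that the standard "openstack" CLI is installed, and that admin credentials are already loaded in the environment. The function name "shelve_all_on_host" and the host name in the usage note are hypothetical and not taken from our tools repo.

```shell
#!/usr/bin/env bash
# Sketch only: shelve every VM on one hypervisor before a planned reboot.
# Assumption: OpenStack cloud, admin credentials already sourced.

shelve_all_on_host() {
    local host="$1"   # hypervisor host name (hypothetical example: gpu-node-01)
    local id

    # List the IDs of all servers on this host, across all projects (admin only).
    openstack server list --all-projects --host "$host" -f value -c ID |
    while read -r id; do
        # Skip blank lines in the listing.
        [ -n "$id" ] || continue
        echo "Shelving $id"
        # Redirect stdin so the CLI cannot consume the rest of the ID list.
        openstack server shelve "$id" </dev/null
    done
}
```

It would be called as "shelve_all_on_host gpu-node-01". Before rebooting, it is worth checking that every server on the node has actually reached a shelved state.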

...