Openvswitch has been upgraded on the three GPU nodes in question, and the nodes have been rebooted. As a natural side effect, all VMs running on these servers were rebooted as well.

Event log

Time              Event
28.02.23 - 06:11  gpu304 lost network connectivity
28.02.23 - 06:28  gpu302 lost network connectivity
28.02.23 - 06:35  gpu301 lost network connectivity
28.02.23 - 08:00  SkyHiGh operators arrived at work and started working on the issue
28.02.23 - 09:24  All three affected nodes were fixed and returned to production