...

| Time | Event |
|---|---|
| 16.12.23 - 20:22 | The broken floor tile was discovered. |
| 16.12.23 - ca. 23:00 | Agreed that we should remove some weight from R3 and contact help on Monday. |
| 18.12.23 - ca. 09:00 | Contacted a company that will assess the damage and come up with a plan to fix the floor. |
| 18.12.23 - ca. 12:00 | Placed a steel beam under R3 to support it. Migrated all VMs from five of the compute nodes in R3 and removed those nodes from the rack, meaning we are currently running on reduced capacity. |
| 20.12.23 | Visit from the carpenter company. Made an initial plan for what needed to be done. Decided that the steel beam would suffice for support. Little to no further sinking was measured. |
| 17.01.24 - 10:00 | Meeting with the carpenter. A plan was made for repairing the floor: they will build support framing between all the floor tile legs and replace all necessary tiles. The carpenters will need two days, and we scheduled two days for removing all servers and one day for putting everything back in after the floor is fixed, giving a total downtime of five days. |
| 18.01.24 - 13:00 | Received confirmation from the carpenter that they can start the repair work on 7 February. We accepted the offer. |
| 23.01.24 - 15:23 | Messaged all users about the planned downtime in week 6. |
| 05.02.24 - ca. 10:00 | Shut down SkyHiGh, SkyLow and everything else. All servers have been removed from the racks. |
| 08.02.24 - ca. 11:00 | Carpenter work finished. |
| 08.02.24 - ca. 12:00 | Started moving the racks back into place and rewiring fibre cables and environmental sensors. |
| 08.02.24 - 16:00 | All network infrastructure (core and rack switches) is reinstalled and confirmed working as normal. Replaced a broken PDU, and all environmental sensors are confirmed working. |
| 09.02.24 - 08:30 | Started to place all SkyHiGh, NBL, DSE and Hansken servers back in their racks. |
| 09.02.24 - 12:15 | Started to restart infrastructure services in SkyHiGh: databases, Puppet infrastructure, message queues, Ceph cluster, etc. |

...