...

On the evening of December the 15th, we discovered that a few of the floor tiles in our server room have broken. As a result, two of our racks now each have a leg with no support, and the racks have become "a bit" tilted (see picture below). While awaiting the carpentry work, we have removed five compute nodes from the most tilted rack to reduce its weight, and supported the rack with a steel beam.

...

Planned implemented fix

  • Shut down and remove all servers from all eight racks. Leave the switches and cabling in place. Move the racks so the carpenters can rebuild the floor.
  • Build support framing for the legs that support the floor tiles. Replace the floor tiles with thicker tiles.
  • Put all servers back in the racks, and make sure everything still works as normal.

Timeline for maintenance

All dates are "latest" dates, meaning the work might happen earlier, depending on when the carpentry work finishes.

| Date | Event |
|---|---|
| 05-06.02.24 | Shut down all servers and remove them from the racks |
| 07-08.02.24 | Carpenter work. Build support framing, replace tiles |
| 09.02.24 | Insert all SkyHiGh servers back in the racks, and resume production |
| 12.02.24 | Insert the rest of IIK's servers back, and resume production (NBL, Hansken, DSE) |
| 12-16.02.24 | Place all SkyLow servers back, and open for external users with servers in the outside part of K001. |

Event log

| Time | Event |
|---|---|
| 16.12.23 - 20:22 | The broken floor tile was discovered |
| 16.12.23 - ca. 23:00 | Agreed that we should remove some weight from R3, and contact help on Monday |
| 18.12.23 - ca. 09:00 | Contacted a company that will assess the damage, and come up with a plan to fix the floor |
| 18.12.23 - ca. 12:00 | Placed a steel beam under R3 to support it. Migrated all VMs from five of the compute nodes in R3, and removed those nodes from the rack, meaning we are currently running on reduced capacity. |
| 20.12.23 | Visit from the carpenter company. Made an initial plan for what needed to be done. Decided that the steel beam would suffice for support. Little to no further "sinking" was measured. |
| 17.01.24 - 10:00 | Meeting with the carpenter. A plan was made for repairing the floor. They will build support framing between all the floor tile legs, and replace all necessary tiles. The carpenters will need two days, and we schedule two days for removing all servers and one day for putting everything back in after the floor is fixed, meaning a total downtime of five days. |
| 18.01.24 - 13:00 | Received confirmation from the carpenter that they can start the repair work on 6th 7th of February. We accepted the offer. |
| 23.01.24 - 15:23 | Messaged all users about the planned downtime in week 6 |

...