At around 4am UK (London) time on 31 December 2021, a power loss was logged on the server known as STOR3 (Node23), which caused an outage. While we were waiting for on-site engineers to conduct further analysis, the server went down again at around 12:34pm UK time.
On this occasion the chassis recorded a failed memory module, so a number of components need to be checked and replaced by on-site engineers. To protect the integrity of everyone's data, we have decided to leave this server powered down until a final resolution is achieved. Allowing it to run while it could fail suddenly could damage the RAID array and would carry a very real risk of complete data loss.
Apologies for the inconvenience, but this server holds hundreds of TB of data, and having to recover a RAID array of this size is outside our risk appetite. We have therefore elected for an outage until IOflood engineers are able to conclude their diagnostics and resolve the hardware issue.
The server was allowed to boot briefly, just far enough to verify the integrity of the RAID array, before being paused and shut back down.
As soon as the on-site engineers have finished their work, the server will be powered on and service restored.
Any updates will be placed here as required.
Update 20:45: the defective RAM was identified and replaced. The server is up and running and services have been restored.
Friday, December 31, 2021