The roads I take...
KaiRo's weBlog
July 6th, 2009
Personal Server Outage Hopefully Solved
If you wondered why my blog, SeaMonkey development, SeaMonkey deutsch, www.kairo.at or any other of the sites from my server have been unavailable for most of the last day, here's the story of that toxic incident without going into much detail:
- The server lost track of a hard disk,
- the (software/kernel) RAID 5 for our root filesystem choked,
- I forced a reassembly of the array, which only made it give us an I/O error (ouch),
- then we turned off the server, pulled out the disk,
- put it in again and it worked again,
- we recreated the RAID array only to discover that though this was fine, it didn't detect the filesystem on it any more,
- fsck.ext3 choked on it with about a million error messages about invalid journals and inodes,
- we reformatted it and reinstalled the whole system,
- restoring all important data from our backups.
Thankfully, the backups were from about 5-6 hours before the system first went down, so not too much was lost, but it has taken me 10-12 hours to get to this point, where everything seems to be alright. (I'm sure I'll discover a few small things in the next few days, but things look alright on all major sites, mail flows, etc.)
If you sent mail to me between 3am and 9am CEST on July 5, it's very possibly lost; in other cases it should either be there already or come in once the sending SMTP servers realize that this server is back with them.
I just hope the lost time for studying isn't too toxic for the exam tomorrow, which happens to be in - toxicology.
By KaiRo, at 02:18 | Tags: outage, RAID, Server