Faster and better updates of status page...
Short, concise description of the idea
The status page needs to be updated faster, and with a clearer picture of what's going on.
Full description of the idea
Here's an example: Multiple journals (and some portions of the site) have been returning "database unavailable" errors since around 4:45am CST (09:00 GMT). The status page still listed information from the 27th of October (two days earlier). It was finally updated at approximately 8:00am CST (13:00 GMT), a full three (plus) hours after the issues began, to say that the porkchop cluster is undergoing maintenance and is unavailable.
Not only was this a significant lag - in which we were left wondering - but the status page mentions only ongoing maintenance. Maintenance on a cluster should not take four hours, and if it does, it should be planned. If it is planned, a listing in news (or something similar) would be appreciated. If it is a major issue requiring extended downtime (i.e. something broke), the status page should give at least a brief description of the problem at hand (e.g. "porkchop is broken, we're attempting to fix the cluster, journals on it will be unavailable until it is fixed, sorry.") instead of just saying "...undergoing maintenance...", as that makes us (myself, at least) think of a quick fix here and there, not a ridiculously extended downtime.
(Ridiculously extended downtimes are fine; every website has them occasionally, as things break and need to be repaired. That, I don't mind. Letting us KNOW, however, would be nice. If this WAS planned maintenance, I can find no record of it that I - as a user - have access to.)
Additionally, it may help to include more detailed commentary (such as "The database is unavailable due to maintenance. We hope this will only take a short time; however, you will be unable to comment, log in, or use your journal during this time"), or to auto-refresh to the status page when someone attempts to access a journal on a downed cluster.
An ordered list of benefits
- Users would know the true issue behind a problem, and might not be as upset as long as things are working again promptly.
- Staff may be able to cut down on the number of support requests during extended downtime.
- When extended downtime maintenance is planned, users would be well aware of it and not be as worried.
- When major issues do happen, users would feel that there is more communication from the staff - as opposed to brushing us off with promises of simple maintenance that will be over soon.
- Users would be informed MUCH FASTER of downtime, and not be left wondering for several hours.
An ordered list of problems/issues involved
- Staff would have to think/work slightly longer on the status messages instead of tossing pre-canned posts onto the status page. This may cut into the time available to work on issues (though hopefully not by more than a minute or so).
- Extended maintenance would have to be scheduled, and notification given each time via a news posting or similar.
- Staff would have to de-techify issues when posting to the status page.
- Some major code may have to be edited to implement the refresh function.
An organized list, or a few short paragraphs detailing suggestions for implementation
- Give multiple people the duty of updating the status page the moment an issue is discovered, and of re-updating it with progress reports during the work (as time permits), including more information than just "undergoing maintenance." This ensures that whoever gets called in will be able to post an update, and the word would get out quickly.
- Make updating the status page the first thing that is done when an issue is discovered - it shouldn't take long (if it does, make it easier/faster).
- Announce extended maintenance - and scheduled maintenance - in news, so that everyone would see it (on the friends page, if not on the front page of the site), and update the status page a good few hours in advance.
- Include a scaled-down version of the status message on the main page or userinfo pages (possibly at the top of the page, or the very bottom).
- Find a way to show a link to the status page (or refresh directly to it) instead of "Sorry, database unavailable" when someone accesses downed journals/areas.
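To make the last suggestion a bit more concrete, here is a rough sketch in Python of what the "database unavailable" page could do instead. This is only an illustration and not how the site's code actually works: the `ClusterStatus` record, the `unavailable_page` function, and the status page URL are all made-up names. The idea is simply that the error page carries the staff-written message, says whether the downtime was planned, links to the status page, and optionally refreshes to it.

```python
from dataclasses import dataclass
from html import escape
from typing import Optional

# Hypothetical address; the real status page URL would go here.
STATUS_PAGE_URL = "https://status.example.invalid/"


@dataclass
class ClusterStatus:
    """Hypothetical record of what staff have posted about one cluster."""
    cluster: str
    available: bool
    message: str = ""      # plain-language explanation written by staff
    planned: bool = False  # True for scheduled maintenance


def unavailable_page(status: ClusterStatus,
                     refresh_seconds: Optional[int] = 10) -> str:
    """Build the HTML shown when a journal's cluster is down."""
    detail = status.message or "The database is unavailable."
    kind = "Planned maintenance" if status.planned else "Unplanned outage"
    url = escape(STATUS_PAGE_URL)

    # Optional auto-refresh to the status page after a short delay.
    refresh = ""
    if refresh_seconds is not None:
        refresh = (f'<meta http-equiv="refresh" '
                   f'content="{refresh_seconds}; url={url}">')

    return (
        f"<html><head>{refresh}</head><body>"
        f"<h1>{kind}: {escape(status.cluster)}</h1>"
        f"<p>{escape(detail)}</p>"
        f'<p>See the <a href="{url}">status page</a> for updates.</p>'
        f"</body></html>"
    )
```

For example, `unavailable_page(ClusterStatus("porkchop", False, "porkchop is broken, we're attempting to fix it"))` would give a page with that message, a link to the status page, and a ten-second refresh; passing `refresh_seconds=None` leaves out the refresh and keeps only the link.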