
We have a dedicated Debian x64 server running our projects, and we require availability under all circumstances. Over the last 2 days we have noticed that the server went offline for 10-15 minutes and then came back again.

Nothing is more problematic than a server that unexpectedly goes offline, as high availability is one of our requirements.

Is there any way we can be notified if the server is offline? I checked for something similar and found this, but this checks periodically. I would like to know how long the server was offline as soon as it is up and running again.

Thank you.

We are Borg
  • For periodic external checking (can be each minute if run as a cron job) there is a bash program at http://www.timedicer.co.uk/programs/help/tiny-device-monitor.sh.php - with -c option it notifies you only of changes in status i.e. offline/online. – gogoud Jun 30 '16 at 15:08

1 Answer


In general, availability monitoring must be done externally, and in most situations you'd do it at regular intervals whose length depends on your needs. Do you know the cause of the outage in question? Was it a server reboot or a network interruption? If you have unexpected reboots, you could let the server send you an email with relevant logging data right after it reboots.
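As an illustration of the external, interval-based check described above, here is a minimal Python sketch that reports only status changes and, on recovery, how long the outage lasted. It is one possible approach under stated assumptions, not a definitive implementation: the names `OutageTracker`, `is_reachable` and `monitor`, the TCP port and the 60-second interval are all hypothetical, and it has to run on a second machine outside the provider's network.

```python
import socket
import time
from typing import Optional


class OutageTracker:
    """Tracks up/down transitions and reports how long each outage lasted."""

    def __init__(self) -> None:
        # Timestamp of the first failed check, or None while the host is up.
        self.down_since: Optional[float] = None

    def observe(self, is_up: bool, now: float) -> Optional[str]:
        """Feed one check result; return a message only when the status changes."""
        if not is_up and self.down_since is None:
            self.down_since = now
            return "DOWN"
        if is_up and self.down_since is not None:
            duration = now - self.down_since
            self.down_since = None
            return f"UP after {duration:.0f}s offline"
        return None


def is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections and DNS failures
        return False


def monitor(host: str, port: int, interval: float = 60.0) -> None:
    """Poll the host forever and print a line on every status change."""
    tracker = OutageTracker()
    while True:
        message = tracker.observe(is_reachable(host, port), time.time())
        if message:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} {host}: {message}")
        time.sleep(interval)
```

Calling `monitor("server.example.com", 22)` (a placeholder host and port) would start the loop; the `print` call could be swapped for whatever notification channel you use. The reported duration is only as precise as the polling interval, so a 60-second interval can be off by up to about a minute at each end of the outage.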

Hans-Martin Mosner
  • It was a network interruption on the provider's side. A reboot would be a big problem, as the infrastructure has to be rebooted in a particular way and there is stuff to be done post-boot. – We are Borg Jun 30 '16 at 11:24
  • Work toward eliminating the need to do things after rebooting, or automate the "stuff". – Rui F Ribeiro Jun 30 '16 at 11:57
  • 100% agree with @RuiFRibeiro - if the server requires any manual stuff to be done after every reboot, then whoever set it up has not finished their job. The only exception is if a password has to be entered to mount an encrypted filesystem (scripting the password means having it in plain-text on the system itself, which defeats the point of fs encryption). – cas Jul 01 '16 at 02:39