[BBLISA] simpler alternative to Nagios

Bill Bogstad bogstad at pobox.com
Wed Sep 1 17:09:18 EDT 2010


On Wed, Sep 1, 2010 at 3:10 PM, Dean Anderson <dean at av8.com> wrote:
> Oh yeah. I forgot to point out in my last message that by responding in
> the interrupt handler, response to ping _does_not_ mean the network
> stack is functioning.

In an implementation where ICMP echo is handled in an interrupt
handler, that is true.  However, unless all you are using your OS for
is packet forwarding, the difference between a kernel crash and a
homicidal init process is pretty insignificant.  Ping will still
work in either case, but no user processes will run.
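
For monitoring purposes, a check that only passes when a user process
actually answers sidesteps the question.  A minimal sketch in Python,
assuming the monitored service sends some bytes as soon as a client
connects (an SSH server's version banner, for example); the function
name and the timeout default are made up for illustration.  Note that a
bare TCP connect is not enough, since the kernel completes the handshake
for a listening socket whether or not the process ever calls accept():

```python
import socket


def userspace_responds(host, port, timeout=3.0):
    """Return True only if the peer sends data after we connect.

    A completed TCP handshake only shows that the remote kernel is
    listening on the port.  Receiving bytes shows that a user process
    was scheduled, accepted the connection, and wrote to it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            # An empty result means the peer closed without sending anything.
            return len(conn.recv(64)) > 0
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Something like userspace_responds("host.example.com", 22) would then
stand in for a ping check (the hostname is a placeholder).  It will not
work as-is for protocols where the client speaks first, such as HTTP;
there the check would have to send a request before reading.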

> The kernel is traditionally divided into a lower half (interrupt
> handlers and code to support interrupt handlers) and an upper half
> (everything else).  The modularity of the kernel depended on the machine
> independent code all being upper-half and drivers handling the relation
> between upper/lower.  When the kernel crashes, it is a catastrophic
> failure of code either in the upper half or lower half.  If the upper
> half fails and e.g., holds a critical lock and goes into an infinite
> loop as opposed to panic(), the lower half interrupt handlers may
> continue to operate.  The network stack is in the upper half. Nothing
> can run, but linux will still ping.

I took a quick look at the current Linux network stack after your
first note.  I didn't trace down all the details, but it appears
likely that, whatever it was like in the past, this is no longer
the case.
OTOH, other than the code complexity issues, I don't think ICMP ECHO
in an interrupt handler is either a good or a bad thing.  Just
different.  People need to check their assumptions at the door when
dealing with different systems.

Bill Bogstad


