[BBLISA] statistics-based zero config network management: why doesn't this exist?

Sat Aug 3 16:09:05 EDT 2013

Roughly 90% of the data available via SNMP isn't relevant for
monitoring, and on some systems the full tree is massive and takes a
long time to poll.

Zenoss seems to have a decent auto-discovery system, and will do its
best to detect the type of system and apply a template. That template
defines what is relevant to monitor. I personally didn't like it, as it
was rather complicated to do anything beyond what it offers out of the
box, and (based on my evaluation) the kind of monitoring I generally
deal with needed more flexibility.

Aside from auto-discovery, I like cacti for its presentation and for
seeing trend data. I like nagios for its flexibility and notification
handling, but I think it requires a lot of scripting to use effectively.
I may soon investigate some autodiscovery-type systems, or design one of
my own. I do like that nagios can be provisioned by outside systems,
even multiple ones.
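
The statistical alerting step in the proposal quoted below could be
sketched roughly like this in Python. This is only an illustration under
assumptions, not an existing tool: the function name is_anomalous, the
threshold of 10 standard deviations, the minimum sample count, and the
sigma floor of 1.0 are all made up for the example, and the history is
assumed to hold per-poll deltas of a numeric value rather than raw
cumulative counters.

```python
from statistics import mean, stdev


def is_anomalous(history, new_value, threshold=10.0, min_samples=5):
    """Return True if new_value is far outside the history of this OID.

    history     -- earlier numeric samples for one OID (assumed per-poll deltas)
    threshold   -- hypothetical cutoff, in standard deviations
    min_samples -- hypothetical minimum history length before alerting
    """
    if len(history) < min_samples:
        return False  # not enough history to judge yet

    mu = mean(history)
    # Floor sigma at 1.0 (an arbitrary choice) so a flat history such as
    # 0, 0, 0, 0 neither divides by zero nor alerts on a change of 1.
    sigma = max(stdev(history), 1.0)
    return abs(new_value - mu) / sigma > threshold


# Hypothetical samples keyed by OID; this one is ifInErrors for ifIndex 1.
samples = {"1.3.6.1.2.1.2.2.1.14.1": [0, 1, 2, 0, 1, 0]}

for oid, history in samples.items():
    for value in (2, 1234567):
        if is_anomalous(history, value):
            print(f"ALERT {oid}: {value} vs history {history}")
```

With a threshold this loose, ordinary jitter between 0 and 2 stays quiet
while a jump to 1,234,567 fires. Non-numeric values such as serial
numbers would need a separate "changed or not" check.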

On 8/3/2013 3:52 PM, Alex Aminoff wrote:
>
> I'm looking at SNMP-based network monitoring systems: cacti, zabbix,
> some other similar ones. All of them seem to require you to configure
> your devices on the system. There are some auto-discovery functions, but
> they only work if you have loaded up the "profile" or "template" for
> your particular network hardware.
>
> So why is this necessary? Suppose instead there was a network monitoring
> system that worked like this:
>
>    - Find any SNMP device on your subnet
>    - Walk its SNMP tree, collecting all data, no matter what it is:
> interface counters, manufacturer's serial number, I don't care
>    - Save this data in some sort of time series storage, like RRD
>    - Then use statistics to throw an alert when a new value (or more
> likely a group of new values) differs sufficiently in statistical terms
> from the history of that value.
>
> The great thing about this plan is you don't need to configure in
> advance the MIBs and OIDs. When an alert happens, the system can include
> the OID in the message. A human can then look it up or otherwise deal.
>
> There will be false positives, but one should be able to filter those
> out once they happen. A real network problem in my experience involved
> some values jumping from 0-1-2-0 to 1,234,567 so you can dial the
> sensitivity way down on the statistical tests.
>
> My question is, why does this not exist? Is there some reason I have
> overlooked why this would be impractical? Or does it exist and I just
> have not found it?
>
>    - Alex