[BBLISA] Large scale log processing

Mike Sprague mfs at komerex.com
Fri May 15 09:25:11 EDT 2009


Hi folks,

Long time listener, first time caller. :-)

I work for a web hosting company with about a thousand Linux servers.
We're discussing options for processing the logs, mainly from our mail
and web servers, to make troubleshooting easier.  We're not really
looking for long-term storage, just a better way to search the logs to
diagnose specific customer issues, broad system attacks, issues across
a pool of servers, or issues with a specific server.

One obvious solution is syslog-ng and a central log server.  While I'm
sure this will work, there will still be a lot of data to search
through, which could be time-consuming.

A colleague mentioned Hadoop/MapReduce (http://hadoop.apache.org/).  On
the surface, this seems like it might be a good fit, but I don't have
any experience with it.
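
To make the question concrete, here is my rough understanding of the
map/reduce idea applied to logs, as a toy single-process Python sketch.
It is not Hadoop itself, and the sample lines, host names and function
names are all made up for illustration.  It assumes classic syslog-style
lines where the fourth field is the host name:

```python
from collections import defaultdict
from itertools import chain

# Hypothetical log lines in classic syslog layout; hosts and messages are made up.
SAMPLE_LINES = [
    "May 15 09:25:11 mail01 postfix/smtpd[1234]: connect from unknown[192.0.2.10]",
    "May 15 09:25:12 web07 httpd[2201]: client denied by server configuration",
    "May 15 09:25:13 mail01 postfix/smtpd[1234]: NOQUEUE: reject: RCPT from unknown[192.0.2.10]",
    "this line is malformed",
]


def map_line(line):
    """Map step: emit a (host, 1) pair for each line that looks parseable."""
    fields = line.split()
    if len(fields) < 5:
        # Too few fields to be "month day time host tag: message"; skip it.
        return []
    return [(fields[3], 1)]


def reduce_counts(pairs):
    """Reduce step: sum the values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_by_host(lines):
    """Run the map step over every line, then reduce the combined output."""
    return reduce_counts(chain.from_iterable(map_line(line) for line in lines))


if __name__ == "__main__":
    print(count_by_host(SAMPLE_LINES))
```

As I understand it, the appeal of a framework like Hadoop would be that
the map step can run on many machines at once over different chunks of
the logs, with the reduce step combining the partial results.  Whether
that pays off at our scale is exactly what I'm unsure about.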

I was hoping y'all could give some suggestions on what you use for this
stuff and your opinion on how well it works.  I'm not looking for hard
answers, just some suggestions on where we should look for a possible
solution.  Any pointers are appreciated.

I'd be happy to post a summary back here if y'all are interested.

Thanks,
mikeS

-- 
Michael F. Sprague
mfs at komerex.com



