[BBLISA] backups

John Stoffel john at stoffel.org
Wed Dec 15 10:48:31 EST 2010


>>>>> "Ryan" == Ryan Pugatch <rpug at linux.com> writes:

Ryan> On 12/14/2010 04:31 PM, John Stoffel wrote:
>> Do you have money to spend?  How much data?
>> 

Ryan> Yes, we have money to spend toward a solution.  We're talking about 65T 
Ryan> of data.  25T are archived logs and photos (so there is no change, just 
Ryan> additions)

Ryan> getting offsite via 100Mbit or 1GigE
>> 
>> Yeah, you have money.  :]  Though I bet you're not going too far
>> offsite with your setup at those prices.

Ryan> Going to VA from MA.

Ok, you're going a fair distance then, so you'll want to tune your TCP
settings to maximize bandwidth.  Look into the "Bandwidth Delay
Product", which basically says that as your bandwidth or latency goes
up, your TCP send/recv windows *also* need to go up, or you'll hit
performance issues.  You'll see this as sawtooth bandwidth usage in
your logs.
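
To put rough numbers on the Bandwidth Delay Product, here is a small
Python sketch.  The 20 ms round trip is purely my assumption for a
MA to VA link (measure yours with ping), and the function names are
just for illustration.  It shows the window you'd need to fill the
pipe, and the ceiling a plain 64 KiB window puts on a single stream
(roughly 26 Mbit/s at 20 ms, no matter how fat the link is).

```python
def bdp_bytes(bandwidth_bits_per_sec, rtt_seconds):
    """Bandwidth-delay product: bytes that must be in flight to fill the pipe."""
    return bandwidth_bits_per_sec * rtt_seconds / 8


def max_throughput_bits_per_sec(window_bytes, rtt_seconds):
    """Upper bound on single-stream TCP throughput for a given window size."""
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    rtt = 0.020  # ASSUMPTION: 20 ms round trip, not a measured value
    for name, bw in (("100Mbit", 100e6), ("1GigE", 1e9)):
        kib = bdp_bytes(bw, rtt) / 1024
        print(f"{name}: need a window of about {kib:.0f} KiB")
    # A 64 KiB window (i.e. no window scaling) over the same assumed RTT:
    cap = max_throughput_bits_per_sec(65536, rtt) / 1e6
    print(f"64 KiB window caps one stream at {cap:.1f} Mbit/s")
```

If the window you need is bigger than what your OS allows by
default, that's the knob to go looking for.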

It's a pain to deal with.  Some of the WAN acceleration vendors like
SilverPeak, Riverbed, Bluecoat, Cisco, Juniper, etc. can help with
this.

>> I personally still like Legato/EMC networker, but I haven't used it
>> for years.  The CLI tools are very well done so you can easily script
>> up reports or do stuff by hand.  And the browsing for restores is
>> excellent, best I've ever used out of:  Veritas NetBackup 5.x, Legato
>> Networker 4.x & 5.x, Bacula, Commvault 7.x and 8.x.
>> 
>> The big thing to remember is that no one cares about backups, they
>> only care about restores.  So a tool which makes restores easy and
>> painless is the best.

Ryan> True that!

>> Also, if you are using Netapp storage, it might be worth it to
>> snapmirror across your WAN link and then backup to disk remotely.  Not
>> cheap I admit, but certainly lets you keep your DR and restore options
>> lightning fast.

Ryan> Hoping to avoid expensive storage.. I'd love to use cheap Dell arrays.

Yeah, Netapp isn't cheap, and from the sound of it, you have lots of
data, but not all of it needs to be on high performance arrays.  

>> CommVault will do the D2D2D without a problem, but I'm not really
>> enamored of how they do things.  They have this Java client to do any
>> and all changes and it's just not nice at times.  They have an
>> interesting mindset (and they've been around for ages and ages, spun
>> off from AT&T) and it does work.  It's just... painful to wrap your
>> head around.

Ryan> CommVault is currently my top option.  They do dedupe,
Ryan> replication and I can put it all on cheap arrays.  It is pretty
Ryan> expensive though.

You haven't priced out EMC Networker then, have you?  Makes CommVault
look cheap.  *grin*

One thing CommVault might buy you is that you can do a single full
backup, then incrementals, then a Virtual Full at the remote site,
which builds a full backup without having to re-send all the data over
the WAN.  In your situation, where you just append data, this might be
a big win.  
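
The idea behind a Virtual Full, as a toy Python sketch.  This is NOT
how CommVault implements it; the catalog format (a dict of path to
version id, with None marking a deletion) is something I made up to
show the concept.  The point is that the merge happens entirely at
the remote site from data that's already there.

```python
def synthetic_full(full, incrementals):
    """Merge a full backup catalog with incrementals, applied oldest first.

    Each catalog is a hypothetical dict of path -> version id.
    A value of None in an incremental means the file was deleted.
    """
    merged = dict(full)  # copy so the original full is left untouched
    for inc in incrementals:
        for path, version in inc.items():
            if version is None:
                merged.pop(path, None)  # file removed since the last backup
            else:
                merged[path] = version  # new or changed file wins
    return merged


if __name__ == "__main__":
    full = {"/logs/2010-11.gz": "v1", "/photos/a.jpg": "v1"}
    monday = {"/logs/2010-12.gz": "v1"}   # pure addition
    tuesday = {"/photos/a.jpg": "v2"}     # changed file
    print(synthetic_full(full, [monday, tuesday]))
```

With append-mostly data like your archived logs and photos, the
incrementals stay small, so only the additions ever cross the WAN.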

>> Bacula is open source and fairly easy to setup, but... the restore
>> process is out of the 80s, and just not friendly.  You can buy
>> professional support, and maybe they have better tools on that side of
>> things.  I do use it at home, and it has saved my bacon, but it's not
>> what I think it should be.
>> 
>> Amanda.  Haven't used it for years and years.  It might now be called
>> Zmanda or something.  You might want to check it out.

Ryan> We're currently using Zmanda/Amanda but it has been horrible as
Ryan> we've grown.

I can see that.  Another option is rsync-based backups.  Especially
over the WAN, running a bunch in parallel, you should get OK
performance.
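
A minimal sketch of the "bunch in parallel" idea in Python, one rsync
per top-level directory.  The host name, paths and worker count are
placeholders I invented; the rsync flags (-a, --partial, -e ssh) are
standard ones.  Treat it as a starting point, not a finished tool.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def rsync_command(src_dir, dest):
    """Build one rsync invocation for a single source directory."""
    # Trailing slash on the source copies the directory's contents into dest.
    return ["rsync", "-a", "--partial", "-e", "ssh",
            src_dir.rstrip("/") + "/", dest]


def run_parallel(src_dirs, dest_root, workers=4, runner=subprocess.call):
    """Run one rsync per source directory, `workers` at a time.

    Returns the list of exit codes in the same order as src_dirs.
    `runner` is swappable so the function can be exercised without rsync.
    """
    cmds = []
    for d in src_dirs:
        leaf = d.strip("/").split("/")[-1]
        cmds.append(rsync_command(d, f"{dest_root}/{leaf}/"))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(runner, cmds))


# Example (hypothetical host and paths):
# run_parallel(["/data/logs", "/data/photos"], "backup.example.com:/backup")
```

Each stream gets its own TCP connection, which is also why running
several helps on a high latency link.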

Another tool you can use to send data over large fat pipes is 'bbcp'
which takes data (preferably in large, single files) and sends it in
multiple parallel streams.  Using this, I can bury a dual T3 across
the country with data.  Still not ideal...

>> Veritas Netbackup.  I didn't like it when I used it, but I can't say I
>> used it in anger at all.  And it was over six years ago, so god knows
>> what's changed.

Ryan> You mean Symantec NetBackup now ;)

Yeah, whatever it is.  It's been years and honestly I can't say
anything good or bad about it since I've never really used it much.  

>> Are you looking to back up a single large system?  A bunch of smaller
>> stuff?  Money to burn?  Willingness to go opensource?

Ryan> Whatever works, basically.  We run lots of open source stuff, so
Ryan> that's not a problem.  That said, Amanda has been pretty rough
Ryan> for us.  We're definitely looking for truly enterprise grade
Ryan> backups.  Also, our teams run pretty small so it isn't like we
Ryan> have time to fight to make the backups work all of the time.

Yeah, CommVault *might* be your best bet, but if you dislike Windows,
then you're in trouble.  The core runs on a Windows Server, but the
Media Agents (MAs) can run on Linux/Solaris x86/Windows, which makes
management of them easier.

Oh yeah, since you're thinking of doing D2D2D, neither CommVault nor
any other vendor that I'm aware of will let you use NDMP to push data
to tape really quickly, so you're going to have to size your systems
so that they can push the data (10G pipes, I bet) from one disk array
to another over a dedicated backup network.
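
For the sizing, the arithmetic is simple enough to script.  This
Python sketch uses your 65T figure; the 48 hour window and the
function names are my assumptions for illustration, and it counts a
terabyte as 10**12 bytes.

```python
def required_mbit_per_sec(data_tb, window_hours):
    """Sustained rate (Mbit/s) needed to move data_tb terabytes in window_hours."""
    bits = data_tb * 1e12 * 8
    return bits / (window_hours * 3600) / 1e6


def transfer_days(data_tb, link_mbit_per_sec, efficiency=1.0):
    """Days needed to move data_tb terabytes over a link.

    efficiency is the fraction of line rate you actually sustain (a guess).
    """
    bits = data_tb * 1e12 * 8
    return bits / (link_mbit_per_sec * 1e6 * efficiency) / 86400


if __name__ == "__main__":
    print(f"65T in 48h needs {required_mbit_per_sec(65, 48):.0f} Mbit/s")
    for name, mbit in (("100Mbit", 100), ("1GigE", 1000), ("10GigE", 10000)):
        print(f"65T over {name}: {transfer_days(65, mbit):.1f} days at line rate")
```

At full line rate, a complete 65T pass takes about 6 days on 1GigE
and about 60 days on 100Mbit, which is why you want to avoid ever
re-sending a full over the WAN.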

And since you'll be having to basically do a 'find' down into the
filesystems to make the backup, being able to take a snapshot first,
then trawl the snapshot for changes and send those might be ideal.
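
Roughly what that trawl looks like, as a Python sketch.  It assumes
the snapshot is already mounted somewhere readable and that you've
recorded the time of the last backup; both of those, and the
function name, are my assumptions.  It's the moral equivalent of
'find -newer', nothing smarter.

```python
import os
import tempfile


def changed_since(root, last_backup_epoch):
    """Yield files under root whose mtime is newer than last_backup_epoch."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime > last_backup_epoch:
                    yield path
            except OSError:
                # File vanished mid-walk. That can happen on a live
                # filesystem; walking a snapshot should avoid it.
                continue


if __name__ == "__main__":
    # Tiny self-contained demo in a temp directory standing in for a snapshot.
    with tempfile.TemporaryDirectory() as snap:
        old = os.path.join(snap, "old.log")
        new = os.path.join(snap, "new.log")
        for p in (old, new):
            open(p, "w").close()
        os.utime(old, (1000, 1000))   # pretend: written before the last backup
        os.utime(new, (3000, 3000))   # pretend: written after it
        print(list(changed_since(snap, 2000)))
```

Note that this still has to stat every file, so with millions of
small files the walk itself becomes the bottleneck.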

Hmm... thinking about it more, you might want to look into using
Solaris and ZFS for your filesystems and storage, then use the 'zfs
send' command (it's zfs, not zpool) to send snapshots of changed
blocks between systems.  Maybe.  Depends on what you use now and how
much change you can handle.
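
For reference, the command is 'zfs send', and an incremental send
between two snapshots uses -i, piped into 'zfs receive' on the far
side.  Here's a small Python helper that just builds that pipeline
as a string so you can eyeball it before running anything.  The pool,
dataset, snapshot and host names are all made up.

```python
import shlex


def zfs_incremental_pipeline(dataset, prev_snap, new_snap,
                             remote_host, remote_dataset):
    """Return a shell pipeline that ships the delta between two snapshots."""
    send = ["zfs", "send", "-i",
            f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    recv = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    # Quote every argument so odd characters in names can't break the shell.
    return " | ".join(" ".join(shlex.quote(a) for a in cmd)
                      for cmd in (send, recv))


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(zfs_incremental_pipeline("tank/photos", "monday", "tuesday",
                                   "backup.example.com", "backup/photos"))
```

You'd take the new snapshot first with 'zfs snapshot', and the
remote side needs to already hold the previous snapshot for the
incremental to apply.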

It's a fun, but potentially troublesome, project to work on!  Good
luck.

John

P.S.  If you have CV questions, I can certainly help, and I'd also
recommend that you get the book "CommVault Storage Policies" as an
introduction to their mindset and how they work.  It won't replace
training, but it will give you a thorough grounding in the concepts
they work in, which isn't quite like anything else out there.


