[BBLISA] rsync vs dump for in-use files?

Tom Metro tmetro+bblisa at vl.com
Sun Sep 30 13:33:25 EDT 2007


Edward Ned Harvey wrote:
> I suppose you could attempt LVM snapshots, but that's crappy at best.

I haven't gone looking for one yet, but I've yet to run across an account
of anyone actually using LVM snapshots. I'm sure it's happening, but it
doesn't seem to be popular.


> ...or some other device with a file system more intelligent than
> EXT3, which is able to do filesystem snapshots...

Supposedly XFS supports snapshots, though I haven't looked into how they 
are implemented.

ZFS has been recently discussed on this list. I recently read about 
Btrfs[1], which is a ZFS-like file system developed by Oracle that's 
being contributed to Linux, but neither is ready for production use on 
Linux.

The previously mentioned NetApp is your best option if you need properly
implemented snapshots and can afford it. Solaris or FreeBSD using ZFS
would be the low-cost route.

1. http://www.sdtimes.com/article/LatestNews-20070801-43.html


> Imagine you have a program, such as mysqld, which opens files read-write,
> and keeps them that way through the entire operation of the process.  At no
> time is a complete file ever written, and at no time is the file ever
> closed.  It is 100% impossible to backup that file, any more recently than
> it was opened.  But filesystems that do snapshotting (ZFS, Netapp, some
> others) can at least allow you to backup  the file as it was, just before
> the most recent time it was opened.

Are you sure that's how it works?

As far as a single file is concerned, inconsistencies come about due to 
buffers not being flushed to disk, so you might have a transaction half 
written to disk and half in the buffer. But it doesn't matter whether 
the file is left open - you'll still be able to get what has been 
written to disk since the file was opened.
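
As a rough illustration of that point, here is a minimal Python sketch (the
file name and record contents are made up). A writer keeps the file open the
whole time, and a separate reader, standing in for the backup program, sees
everything the writer has flushed but not what is still sitting in the
writer's user-space buffer. It assumes the default buffer is larger than the
few bytes written, which holds for CPython's buffered writer.

```python
import os
import tempfile


def read_while_open():
    """Return what a second reader sees while a writer keeps the file open."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "table.dat")   # hypothetical data file

        writer = open(path, "wb")   # opened once, stays open during the "backup"
        writer.write(b"txn1;")
        writer.flush()              # push the user-space buffer to the OS
        writer.write(b"txn2;")      # small write, still in the writer's buffer

        with open(path, "rb") as backup:   # independent open, like a backup tool
            seen = backup.read()

        writer.close()              # closing flushes the remainder
        with open(path, "rb") as f:
            final = f.read()
    return seen, final


if __name__ == "__main__":
    seen, final = read_while_open()
    print("backup saw: ", seen)
    print("after close:", final)
```

The reader gets the flushed transaction even though the writer never closed
the file; only the unflushed tail is missing.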

The bigger problem that leads to the desire for snapshotting is
consistency among multiple files. It's easy to see how a database might
be writing related information to multiple files (like data to the
database file and transaction logging to another file), and in a
traditional backup time will have passed between when the first file is
copied and when the last file in the set is copied.
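
A toy model of that skew, in Python. The two lists stand in for the database
file and its log, and the function names are invented for the illustration;
nothing here reflects how a real backup tool or database works internally.

```python
def copy_one_at_a_time():
    """Traditional backup: files are copied in sequence while the app runs."""
    db, log = ["txn1"], ["txn1"]
    backup = {"db": list(db)}        # first file copied
    db.append("txn2")                # application commits another transaction
    log.append("txn2")               # ...which touches both files
    backup["log"] = list(log)        # last file copied some time later
    return backup


def copy_at_one_instant():
    """Snapshot-style backup: both files captured at the same moment."""
    db, log = ["txn1"], ["txn1"]
    backup = {"db": list(db), "log": list(log)}
    db.append("txn2")                # later writes do not affect the copy
    log.append("txn2")
    return backup


if __name__ == "__main__":
    print(copy_one_at_a_time())
    print(copy_at_one_instant())
```

In the first case the backed-up log mentions a transaction that the
backed-up database file has never seen; in the second the two agree.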

You can address single-file inconsistency, whether you have snapshotting
or not, with the cooperation of the application: having it flush its
buffers or momentarily close its files. Snapshotting helps by permitting
this application downtime to be kept to a minimum.
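
A minimal sketch of the "cooperate, then copy" idea, in Python. The
`flush_for_backup` helper and the file names are hypothetical; a real
application would expose its own mechanism for this, and on a
snapshot-capable volume the copy step would be replaced by taking the
snapshot, which is what keeps the pause short.

```python
import os
import shutil
import tempfile


def flush_for_backup(handle):
    """Push application buffers to the OS, then ask the OS to write them out."""
    handle.flush()               # user-space buffer -> kernel
    os.fsync(handle.fileno())    # kernel cache -> storage device


def backup_with_cooperation():
    """Return the contents of a copy made right after the application flushes."""
    with tempfile.TemporaryDirectory() as d:
        live = os.path.join(d, "table.dat")
        copy = os.path.join(d, "table.dat.bak")

        app = open(live, "wb")
        app.write(b"txn1;txn2;")      # still buffered inside the application
        flush_for_backup(app)         # the brief pause the application agrees to
        shutil.copyfile(live, copy)   # or: take the snapshot at this point
        app.write(b"txn3;")           # the application carries on
        app.close()

        with open(copy, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(backup_with_cooperation())
```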

I believe that for well-designed applications, like a database with a
binary transaction log, if the log and DB file are captured simultaneously,
it doesn't matter whether the buffers have been fully flushed on the DB file, as
the transaction log (which gets flushed frequently by design) will 
correctly reflect the transactions that have been successfully written 
to disk. So in this case snapshotting permits you to avoid any downtime, 
even the few seconds it takes to make the snapshot.
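
To make that reasoning concrete, here is a toy model in Python of recovering
from a snapshot that captured the log and the data file at the same instant.
It is a sketch under simplifying assumptions (a dict for the data file, a
list of key/value records for the log, replay that is safe to repeat) and
does not reflect how any particular database formats or replays its log.

```python
def recover(data_file, txn_log):
    """Replay a committed-transaction log over a possibly stale data file."""
    state = dict(data_file)        # whatever had reached the data file
    for key, value in txn_log:     # the log is assumed flushed at every commit
        state[key] = value         # re-applying an old record is harmless
    return state


# Snapshot taken mid-flight: the log holds three commits, but only the
# first one had been flushed to the data file.
snap_data = {"alice": 100}
snap_log = [("alice", 100), ("bob", 50), ("alice", 75)]

restored = recover(snap_data, snap_log)
print(restored)
```

The restored state matches the log, not the stale data file, which is why
the data file's unflushed buffers stop mattering once both are captured
together.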


> No matter what you do, if people (or processes) keep their files open
> indefinitely and never close, the most recent changes to that file are at
> risk.

What's still in memory, yes.

Fortunately UNIX-like systems don't have to jump through hoops to get 
around file locking, which tends to thwart backups on Windows.

  -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/



