[BBLISA] System Backup thoughts and questions...

Daniel Feenberg feenberg at nber.org
Thu Jan 8 17:55:31 EST 2009



On Thu, 8 Jan 2009, Richard 'Doc' Kinne wrote:

> Hi Folks:
>
> I'm looking at backups - simple backups right now.
>
> We have a strategy where an old computer is mounted with a large external, 
> removable hard drive. Directories - large directories - that we have on our 
> other production servers are mounted on this small computer via NFS. A cron 
> job then does a simple "cp" from the NFS mounted production drive partitions 
> to the large, external, removable hard drive.
>
> I thought it was an elegant solution, myself, except for one small, niggling 
> detail.
>
> It doesn't work.
>
> The process doesn't copy all the files. Oh, we're not having a problem with 
> file locks, no. When you do a "du -sh <directory>" comparison between the 
> /scsi/web directory on the backup drive and the production /scsi/web 
> directory the differences measure in the GB. For example my production /scsi 
> partition has 62GB on it. The most recently done backup has 42GB on it!

I can think of reasons a filesystem might grow as it is copied, such as 
hard or soft links or sparse files on the source, but it is harder to 
think of reasons for the copy to come out smaller.
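
Sparse files are the classic example of growth: the source occupies only 
its allocated blocks, but a naive copy can write the holes out as real 
zeroes, so the copy ends up larger. A quick illustration (the /tmp paths 
are just examples, and whether cp preserves the holes depends on the cp 
implementation):

    # create a roughly 1GB sparse file: one 1KB block at a ~1GB offset
    dd if=/dev/zero of=/tmp/sparse bs=1k count=1 seek=1000000
    ls -l /tmp/sparse          # apparent size is about 1GB
    du -k /tmp/sparse          # only a few KB are actually allocated
    cp /tmp/sparse /tmp/sparse.copy
    du -k /tmp/sparse.copy     # may be the full 1GB if cp filled the holes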

If a file is unlinked (removed) while another program has it open, the 
file will disappear from the directory structure while still occupying 
space on the disk until the program exits or closes the file. That can't 
explain your problem, though, since it would affect df but not du.
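
If you ever do want to check for that situation, lsof (where installed) 
can list files that have been unlinked but are still held open:

    # open files whose on-disk link count has dropped to zero,
    # i.e. removed from the directory tree but still using space
    lsof +L1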

>
> What our research found is that the cp command apparently has a limit of 
> copying 250,000 inodes. I have image directories on the webserver that have 
> 114,000 files so this is the limit I think I'm running into.
>

Very hard to believe any modern Unix has a fixed limit on the number of 
files that can be copied. cp has no need to keep any state for copied 
files, so while I haven't looked at the source, I wouldn't think it would 
even count the number of files. On our recent FreeBSD machines we have 
copied filesystems with millions of files.

What version of which Unix are you using? The first step in getting to 
the bottom of this is to determine whether you are missing whole files, 
or parts of files, on the target. Could this be a problem with files over 
2GB in size on an older filesystem type? In the deep distant past there 
were no doubt cp utilities that failed on large files.
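
If you want to rule that out, find can list anything on the source over 
the 2GB boundary (GNU find syntax shown; older finds want the size in 
512-byte blocks, i.e. -size +4194304):

    # any regular file over 2GB under the production tree
    find /scsi/web -type f -size +2G -ls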

You could run du (without the -s) on each filesystem and compare the 
results, probably with diff (but I haven't tried that) to find out which 
files lost content or didn't get copied. Are they the largest? All of the 
same type or in the same directories? Compare the numbers of blocks with 
the file lengths from ls to confirm.
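
Something along these lines would line the two trees up for diff, 
assuming the backup is mounted at /backup/scsi/web (that path is just a 
guess at your layout):

    # per-file sizes from both trees, sorted by path so diff matches up
    ( cd /scsi/web && du -ak . | sort -k2 ) > /tmp/src.du
    ( cd /backup/scsi/web && du -ak . | sort -k2 ) > /tmp/dst.du
    diff /tmp/src.du /tmp/dst.du | less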

Is the large external drive a USB or FireWire drive? Could it be a driver 
problem? Have you looked at /var/log/messages for NFS or I/O errors 
logged during the copy?
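
Something as simple as the following would turn up most NFS or I/O 
complaints (the exact log file name varies between systems):

    grep -i nfs /var/log/messages
    dmesg | grep -i -e nfs -e 'i/o error'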

> While I'm looking at solutions like Bacula and Amanda, etc., I'm wondering if 
> RSYNCing the files may work.  Or will I run into the same limitation?
>

rsync and the others should work, and have the advantage that if a file 
on the source filesystem becomes silently corrupted, you won't copy the 
corrupt file over the good backup on the next run, since rsync skips 
files whose size and modification time are unchanged. That makes them 
better than the traditional tar for backups - and usually faster, too.
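
A minimal nightly pass over one of the mounts might look like the 
following (the paths are examples from your post, and the trailing 
slashes tell rsync to copy the contents of web rather than the directory 
itself):

    # -a preserves permissions, times, links, etc.; -v lists what moved;
    # add -n first for a dry run that only reports what would be copied
    rsync -av /scsi/web/ /backup/scsi/web/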

Daniel Feenberg

> Any thoughts?
> ---
> Richard 'Doc' Kinne, [KQR]
> American Association of Variable Star Observers
> <rkinne @ aavso.org>
>
>
>



