[BBLISA] Fwd: Moving 100 GB and 1.3 million files

Ben Eisenbraun bene at klatsch.org
Thu Jul 22 15:00:00 EDT 2010


On Thu, Jul 22, 2010 at 02:43:50PM -0400, Ian Stokes-Rees wrote:
> What keeps frustrating me is that no one can describe to me or point me
> at a web page that describes the tools and techniques required to figure
> out what is going on.

I'm reasonably certain that the basic tools have not changed much in 10+
years, so any UNIX book that talks about performance should get you
started: top, iostat, netstat, nfsstat, etc.

Here are some questions you should answer:

- are both your network interfaces full speed and full duplex?  do they
  show any errors?
- what are the transfer speeds for a single large file?  100 MB, 500 MB, 1
  GB?
- what are the transfer speeds for smaller batches of your small files?
  can you get better speeds transferring 100/1000/10000 at a time?  is
  speed uniformly abysmal or does it start fast and drop off?
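
A rough sketch of how the single-file and small-batch timings above could
be collected, in Python.  The names make_files and timed_copy and the file
sizes are made up for illustration; in practice dest_dir would be the NFS
mount or whatever path is slow.  Copies between local directories mostly
measure the page cache, so treat the numbers as relative, not absolute.

```python
import os
import shutil
import tempfile
import time


def make_files(directory, count, size_bytes):
    """Create `count` files of `size_bytes` random bytes each; return their paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for i in range(count):
        path = os.path.join(directory, "f%06d.dat" % i)
        with open(path, "wb") as fh:
            fh.write(os.urandom(size_bytes))
        paths.append(path)
    return paths


def timed_copy(paths, dest_dir):
    """Copy `paths` into `dest_dir`; return (file count, bytes copied, seconds)."""
    os.makedirs(dest_dir, exist_ok=True)
    total = 0
    start = time.perf_counter()
    for path in paths:
        shutil.copy(path, dest_dir)
        total += os.path.getsize(path)
    elapsed = time.perf_counter() - start
    return len(paths), total, elapsed


if __name__ == "__main__":
    # Hypothetical test sizes: 1000 files of 4 KB, copied 100 and 1000 at a time.
    work = tempfile.mkdtemp(prefix="xfer-test-")
    try:
        src = make_files(os.path.join(work, "src"), 1000, 4096)
        for batch in (100, 1000):
            dest = os.path.join(work, "dst%d" % batch)
            n, nbytes, secs = timed_copy(src[:batch], dest)
            rate = n / max(secs, 1e-9)
            print("%5d files  %8d bytes  %.3f s  %.0f files/s" % (n, nbytes, secs, rate))
    finally:
        shutil.rmtree(work)
```

Running the same batch sizes several times in a row is one way to see
whether the rate starts fast and then drops off.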

The reason to do this is that you need a repeatable test case that won't
take 3 days to run.  Find something that has crappy performance and can be
run a few times an hour so you can tweak things and test.
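
One way to get a test case that is both small and repeatable is to pick a
fixed sample of the real files and reuse it for every run.  A minimal
sketch in Python follows; sample_files and its seed parameter are
hypothetical names, and the fixed seed is only there so every run picks
the same files.

```python
import os
import random


def sample_files(root, count, seed=0):
    """Return a reproducible sample of up to `count` file paths under `root`."""
    all_files = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit directories in a stable order
        for name in sorted(filenames):
            all_files.append(os.path.join(dirpath, name))
    rng = random.Random(seed)  # same seed + same file list = same sample
    return rng.sample(all_files, min(count, len(all_files)))
```

Walking 1.3 million files is slow in itself, so it is probably worth doing
once and saving the resulting list to a file rather than re-walking the
tree before every test.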

Try to isolate parts of the problem:

- moving files from host A to host B sucks; can you go the other way and
  get acceptable performance?
- move the source files to a memory file system (i.e. take the disks out of
  the equation).  does performance still suck?  how about sourcing from
  disk and moving to a memory file system?

Other questions:

- if you're using NFS, how are you moving the files?  cp?  mv?  tar?
- if you're using NFS, did you try going around it with rsync?  tar+ssh?
  tar+nc?
- can you rule out the network as the problem?  if so, then look at your
  disks, RAID volumes, etc.
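
As a sketch of one check that helps rule the network in or out: on Linux
the per-interface error and drop counters are in /proc/net/dev, so you can
snapshot them before and after a test transfer and see whether anything
went up.  This is Linux-specific, and the function names below are made up
for illustration.  Speed and duplex are not in that file; a tool such as
ethtool reports them on Linux.

```python
import os


def parse_net_dev(text):
    """Parse Linux /proc/net/dev content into {interface: counters}."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = [int(x) for x in data.split()]
        if len(fields) < 16:
            continue
        # 8 receive columns followed by 8 transmit columns
        stats[name.strip()] = {
            "rx_errs": fields[2],
            "rx_drop": fields[3],
            "tx_errs": fields[10],
            "tx_drop": fields[11],
            "collisions": fields[13],
        }
    return stats


def counter_deltas(before, after):
    """Return only the counters that increased between two snapshots."""
    changed = {}
    for iface, counters in after.items():
        old = before.get(iface, {})
        diff = {}
        for key, value in counters.items():
            delta = value - old.get(key, 0)
            if delta > 0:
                diff[key] = delta
        if diff:
            changed[iface] = diff
    return changed


if __name__ == "__main__" and os.path.exists("/proc/net/dev"):
    with open("/proc/net/dev") as fh:
        for iface, counters in sorted(parse_net_dev(fh.read()).items()):
            print(iface, counters)
```

If the counters stay flat across a slow transfer, that is some evidence
(not proof) that the interfaces themselves are not the problem.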

That's where I would start.  You can't solve problems like this without
putting in the time; there's no web page with a howto on solving this
problem.  It's why people hire UNIX sysadmins.

-ben

--
the cure for boredom is curiosity. there is no cure for curiosity.
                                                  <dorothy parker>


