[BBLISA] Fwd: Moving 100 GB and 1.3 million files

Theo Van Dinter felicity at kluge.net
Thu Jul 22 14:52:26 EDT 2010


How are you moving these files between systems?  Is there a process
you can strace/etc to see what calls are being made?  Are files being
overwritten/updated on the remote side or are these new files?  Can
you narrow down the issue to the sender or receiver?
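
One rough way to look at the sender in isolation is to time how fast
it can simply read a sample of the files, with no network or remote
writes involved.  The sketch below is only an illustration: the
function name measure_read_rate and the 1000-file sample size are
made up for this example and are not part of any existing tool.

```python
import os
import time


def measure_read_rate(root, limit=1000):
    """Read up to `limit` regular files under `root`.

    Returns (files_read, bytes_read, elapsed_seconds).
    """
    files = 0
    total_bytes = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip FIFOs, sockets, broken symlinks
            try:
                with open(path, "rb") as fh:
                    total_bytes += len(fh.read())
            except OSError:
                continue  # skip files we are not allowed to read
            files += 1
            if files >= limit:
                return files, total_bytes, time.perf_counter() - start
    return files, total_bytes, time.perf_counter() - start


# Example (the path is a placeholder for your source tree):
# n, nbytes, secs = measure_read_rate("/path/to/source")
# print(f"{n} files in {secs:.2f}s")
```

If the sender reads the sample far faster than the overall transfer
rate, the receiver is the more likely suspect.  Keep in mind that a
second run may be served from the page cache and look much faster
than the first.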

Some things I'd be concerned about: does the process need to compare
file data or do diffs?  Are the writes synchronous, or are there
fsync() calls when writing the files out?  Are there any obvious
bottlenecks?  Multiple people have noted that the network is not it,
but is there something else on the system that's limited?
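
To get a feel for how much fsync() could matter on the receiving
filesystem, something like the following sketch creates a batch of
small files twice, once with plain buffered writes and once with an
fsync() after each file, and reports files per second for both.  The
names time_small_writes and compare, the 4 KB file size, and the
200-file batch are assumptions chosen for illustration; they are not
taken from the actual transfer.

```python
import os
import tempfile
import time


def time_small_writes(directory, count=200, size=4096, sync=False):
    """Create `count` files of `size` bytes; return elapsed seconds."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"probe_{i:06d}.dat")
        with open(path, "wb") as fh:
            fh.write(payload)
            if sync:
                fh.flush()             # push Python's buffer to the OS
                os.fsync(fh.fileno())  # ask the OS to push it to disk
    return time.perf_counter() - start


def compare(parent=None, count=200):
    """Return (files/s without fsync, files/s with fsync).

    `parent` should be a directory on the filesystem under test;
    the default temp location may be a RAM-backed tmpfs.
    """
    rates = []
    for sync in (False, True):
        with tempfile.TemporaryDirectory(dir=parent) as scratch:
            elapsed = time_small_writes(scratch, count=count, sync=sync)
            rates.append(count / elapsed if elapsed > 0 else float("inf"))
    return tuple(rates)


# Example: plain, synced = compare(parent="/path/on/target/fs")
```

If the fsync'd rate comes out near the roughly 9 files per second
computed below, synchronous writes are a plausible suspect; if both
rates are far higher, the slowdown is probably somewhere else.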


On Thu, Jul 22, 2010 at 2:43 PM, Ian Stokes-Rees
<ijstokes at crystal.harvard.edu> wrote:
>
>
> On 7/22/10 2:34 PM, Rudie, Tony wrote:
>> As a couple of people have said, it's not the gear that's the
>> problem, it's all those little files.  But following in your
>> footsteps, doing the same calculation based on 1.3 million files, we
>> get:
>>
>> 50% done = 650K files, in 20 * 3600 seconds = 9 files per second.
>> That seems low as well.  I just unpacked a tar file with 1000 files
>> in it in 3 seconds.
>
> I should have included those numbers in my first email.  Yes, I've
> looked at those, and they seem stupidly bad, like the system is running
> orders of magnitude slower than it should.
>
> What keeps frustrating me is that no one can describe to me or point me
> at a web page that describes the tools and techniques required to figure
> out what is going on.
>
> Ian


