[BBLISA] Fwd: Moving 100 GB and 1.3 million files

David Allan dave at dpallan.com
Fri Jul 23 10:12:12 EDT 2010


On Fri, 23 Jul 2010, Edward Ned Harvey wrote:

>> From: bblisa-bounces at bblisa.org [mailto:bblisa-bounces at bblisa.org] On
>> Behalf Of David Allan
>>
>> Is my math right?  I'm calculating that the OP is getting about
>> 650 KB/s of throughput.  That seems wrong for any local file transfer
>> on modern gear.  I don't believe my own calculation, though.
>>
>> 94GB, 50% complete = 47GB = 47000MB
>> 47000MB / 20 hr. = 2350MB/hr. = .652MB/s
>
> Without even checking your numbers, I'll say, your math is probably right,
> and your logic is probably wrong.
>
> Suppose you write a 1k file.  Suppose there's 9ms to create the file, and
> another 9ms to write the contents of the file, and another 9ms to update the
> journal.  (This is probably all an underestimate.)  Then you're only going
> to be able to write 1k every 27ms, which is 37 K/s.  Obviously very slow,
> and the reason is high latency to write a small piece of data to disk.
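
Both back-of-the-envelope calculations quoted above can be checked with a few lines of Python.  This is only a sketch: the function names are made up for illustration, 1 GB is taken as 1000 MB as in the quoted figures, and the 9 ms per operation is Ned's stated assumption, not a measurement.

```python
def average_throughput_mb_per_s(total_gb, fraction_done, hours):
    """Average throughput in MB/s, taking 1 GB = 1000 MB as the quoted math does."""
    megabytes_moved = total_gb * 1000 * fraction_done
    return megabytes_moved / (hours * 3600)


def latency_bound_kb_per_s(file_kb, ops_per_file, latency_ms):
    """Upper bound in KB/s when every file costs ops_per_file serialized operations."""
    seconds_per_file = ops_per_file * latency_ms / 1000
    return file_kb / seconds_per_file


# 94 GB, 50% complete after 20 hours
observed = average_throughput_mb_per_s(94, 0.5, 20)
# 1k file; create + write + journal update at an assumed 9 ms each
bound = latency_bound_kb_per_s(1, 3, 9)

print(f"observed average: {observed:.3f} MB/s")   # observed average: 0.653 MB/s
print(f"latency-bound:    {bound:.1f} KB/s")      # latency-bound:    37.0 KB/s
```

The first number is an average over the whole transfer; the second is a ceiling that holds only if every file is as small and as expensive as the model assumes.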

Sorry, "gear" was a bad choice of words on my part.  I should have said
any modern *infrastructure*: hardware and software, including the
filesystem, which I believe everybody is correctly pointing out as the
most likely culprit.  He gave raw throughput numbers and asked if he had
a problem.  IMO, those numbers, without any additional information,
indicate a problem.

Assuming that the problem is this filesystem under this workload: if your
filesystem is only giving you 650 KB/s of throughput for a particular
workload, that's a problem, undoubtedly one that can be fixed with careful
analysis.  If it can't be fixed with this filesystem, then find a
filesystem that doesn't have those performance characteristics for
that workload.

OTOH, a basic sanity check for network and storage contention is
probably the place to start troubleshooting.  Ben's suggestion of timing
the transfer of a single large file, along with other variations, is an
excellent way to gather more data about what kind of problem this might
be before delving into more detailed data gathering.
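
As a rough illustration of that kind of comparison, here is a minimal Python sketch that writes the same number of bytes once as a single file and once as many small files, then reports MB/s for each.  It only measures local writes in a scratch directory, not the actual transfer, and the function names, sizes, and the fsync after every file are assumptions chosen for illustration.

```python
import os
import tempfile
import time


def timed_write(directory, file_count, file_size):
    """Write file_count files of file_size bytes each and return MB/s."""
    payload = b"\0" * file_size
    start = time.perf_counter()
    for i in range(file_count):
        path = os.path.join(directory, f"f{i:07d}.dat")
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # force the data out so caching hides less latency
    elapsed = time.perf_counter() - start
    return (file_count * file_size) / 1e6 / elapsed


def compare(target_dir=None, total_bytes=4 * 1024 * 1024, small_size=4096):
    """Return (large_file_mb_s, small_files_mb_s) for the same total byte count.

    target_dir should sit on the filesystem under test; None uses the
    system default temporary directory.
    """
    with tempfile.TemporaryDirectory(dir=target_dir) as scratch:
        large = timed_write(scratch, 1, total_bytes)
    with tempfile.TemporaryDirectory(dir=target_dir) as scratch:
        small = timed_write(scratch, total_bytes // small_size, small_size)
    return large, small


if __name__ == "__main__":
    large, small = compare()
    print(f"one large file:   {large:8.2f} MB/s")
    print(f"many small files: {small:8.2f} MB/s")
```

If the two figures are far apart, that points at per-file overhead; if both are poor, contention or the underlying storage is the more likely suspect.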

Dave


