[BBLISA] Fwd: Moving 100 GB and 1.3 million files
David Allan
dave at dpallan.com
Fri Jul 23 10:12:12 EDT 2010
On Fri, 23 Jul 2010, Edward Ned Harvey wrote:
>> From: bblisa-bounces at bblisa.org [mailto:bblisa-bounces at bblisa.org] On
>> Behalf Of David Allan
>>
>> Is my math right? I'm calculating the OP is getting 650kbps
>> throughput.
>> That seems wrong for any local file transfer on modern gear. I don't
>> believe my own calculation, though.
>>
>> 94GB, 50% complete = 47GB = 47000MB
>> 47000MB / 20 hr. = 2350MB/hr. = .652MB/s
>
> Without even checking your numbers, I'll say, your math is probably right,
> and your logic is probably wrong.
>
> Suppose you write a 1k file. Suppose there's 9ms to create the file, and
> another 9ms to write the contents of the file, and another 9ms to update the
> journal. (This is probably all an underestimate.) Then you're only going
> to be able to write 1k every 27ms, which is 37 K/s. Obviously very slow,
> and the reason is high latency to write a small piece of data to disk.
Sorry, "gear" was a bad choice of words on my part. I should have said on
any modern *infrastructure*: hardware and software (including what
everybody is, I believe, correctly pointing out as the most likely
culprit, the filesystem). He gave raw throughput numbers and asked if he
had a problem. IMO, those numbers, without any additional information,
are indicative of a problem.
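
For anyone who wants to re-check the arithmetic quoted above, here is a minimal sketch in Python. It assumes decimal units (1 GB = 1000 MB), as the quoted calculation does; the function name is purely illustrative. It also shows that the figure is in kilobytes per second, not kilobits.

```python
# Figures from the thread: 94 GB total, 50% complete after 20 hours.
TOTAL_GB = 94
FRACTION_DONE = 0.5
ELAPSED_HOURS = 20


def throughput(total_gb, fraction_done, elapsed_hours):
    """Return average throughput as (MB/hr, MB/s, KB/s, Mbit/s).

    Assumes decimal units: 1 GB = 1000 MB, 1 MB = 1000 KB.
    """
    mb_moved = total_gb * fraction_done * 1000
    mb_per_hour = mb_moved / elapsed_hours
    mb_per_sec = mb_per_hour / 3600
    kb_per_sec = mb_per_sec * 1000
    mbit_per_sec = mb_per_sec * 8  # 8 bits per byte
    return mb_per_hour, mb_per_sec, kb_per_sec, mbit_per_sec


if __name__ == "__main__":
    per_hr, mb_s, kb_s, mbit_s = throughput(TOTAL_GB, FRACTION_DONE, ELAPSED_HOURS)
    print(f"{per_hr:.0f} MB/hr = {mb_s:.3f} MB/s = {kb_s:.0f} KB/s = {mbit_s:.1f} Mbit/s")
```

The result is roughly 0.65 MB/s, or a little over 5 Mbit/s on the wire, which is low for either local disk or a LAN.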
Assuming that the problem is this filesystem under this workload, if your
filesystem is only giving you 650 KB/s of throughput for a particular
workload, that's a problem, undoubtedly one that can be fixed with careful
analysis. If it can't be fixed with this filesystem, then find a
filesystem that doesn't suffer from those performance characteristics for
that workload.
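
The per-file latency argument quoted above can be put in numbers with a rough sketch like the following. The first function reproduces the 1 KB / 3 x 9 ms example from the quote. The second estimates how long each file is actually taking, under the assumption (not stated in the thread) that half of the 1.3 million files had been copied at the 20-hour mark. Both function names are illustrative.

```python
def small_file_throughput_kb_s(file_kb, ms_per_step, steps):
    """KB/s when every file costs `steps` serialized operations of `ms_per_step` ms."""
    seconds_per_file = steps * ms_per_step / 1000
    return file_kb / seconds_per_file


def implied_ms_per_file(files_done, elapsed_hours):
    """Average wall-clock milliseconds spent per file."""
    return elapsed_hours * 3600 * 1000 / files_done


if __name__ == "__main__":
    # Quoted example: 1 KB file; create, write, and journal update at 9 ms each.
    print(f"{small_file_throughput_kb_s(1, 9, 3):.0f} KB/s")

    # Assumption: 650,000 of the 1.3 million files were done after 20 hours.
    print(f"{implied_ms_per_file(650_000, 20):.0f} ms per file")
```

Under that assumption the copy is spending on the order of 110 ms per file, about four times the 27 ms in the quoted example, which is one more reason to look for contention before blaming the filesystem alone.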
OTOH, doing a basic sanity check for network and storage contention is
probably the place to start troubleshooting. Ben's suggestion of testing
the performance of the transfer of a single large file and other
variations is an excellent way to gather more data about what kind of
problem this might be before delving into more detailed data gathering.
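
A rough local stand-in for that test might look like the sketch below: it writes the same number of bytes once as a single large file and once as many small files, and reports MB/s for each. The sizes and the `write_files` helper are arbitrary choices for illustration, and it only exercises local writes; the real test should use the actual transfer tool and the actual source and destination.

```python
import os
import tempfile
import time


def write_files(directory, count, size_bytes):
    """Write `count` files of `size_bytes` each into `directory`; return MB/s."""
    payload = b"\0" * size_bytes
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"f{i:07d}")
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            # Force the data to disk so the page cache does not hide latency.
            os.fsync(fh.fileno())
    elapsed = time.perf_counter() - start
    return count * size_bytes / 1e6 / elapsed


if __name__ == "__main__":
    # Point `target` at the filesystem under test; the default temp
    # directory may live on a different (or memory-backed) filesystem.
    target = None
    total = 8 * 1024 * 1024  # 8 MiB in both runs

    with tempfile.TemporaryDirectory(dir=target) as d:
        big = write_files(d, 1, total)
    with tempfile.TemporaryDirectory(dir=target) as d:
        small = write_files(d, total // 4096, 4096)

    print(f"one large file:   {big:.2f} MB/s")
    print(f"many small files: {small:.2f} MB/s")
```

If the large-file number is also poor, the problem is more likely contention or hardware than per-file overhead.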
Dave