[BBLISA] accounting for I/O

Daniel Feenberg feenberg at nber.org
Thu Sep 1 17:07:36 EDT 2016



On Thu, 1 Sep 2016, Rob Taylor wrote:

> Have you tried iotop?
> It will tell you what processes are moving the most disk io at any given instant.
> Still might not get you what you want, but it might make it easier to narrow down.

iotop moves the sequential access processes to the top of the list,
because a process doing sequential access moves more kilobytes/second
than one doing random access (because of cache hits, among other reasons).
Our problem program is not "top" in iotop.
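
One way to look for small-request processes instead of high-throughput ones
is to rank by calls per second and bytes per call rather than by
kilobytes/second. The following is only a rough sketch, assuming a Linux host
where the suspect process actually runs (for example an NFS client), and
assuming the per-process counters in /proc/<pid>/io are readable (normally
that requires root for other users' processes). The function names are
hypothetical. Note that syscr/syscw count read and write system calls,
including ones satisfied from cache, so a high call rate with a small
bytes-per-call figure is a hint, not proof of random disk I/O.

```python
def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def io_profile(before, after, interval):
    """Return (calls_per_second, bytes_per_call) between two samples.

    'before' and 'after' are dicts from parse_proc_io taken 'interval'
    seconds apart. A process with many calls per second but few bytes
    per call is a candidate for small, possibly random, I/O.
    """
    calls = (after["syscr"] - before["syscr"]) + (after["syscw"] - before["syscw"])
    nbytes = (after["rchar"] - before["rchar"]) + (after["wchar"] - before["wchar"])
    calls_per_second = calls / interval
    bytes_per_call = nbytes / calls if calls else 0.0
    return calls_per_second, bytes_per_call


def read_proc_io(pid):
    """Read the counters for one pid; return None if they are unreadable."""
    try:
        with open(f"/proc/{pid}/io") as f:
            return parse_proc_io(f.read())
    except OSError:
        return None
```

Sampling read_proc_io() twice for each pid, a few seconds apart, and sorting
by calls per second would put a process issuing many small requests near the
top even when its kilobytes/second figure is unremarkable.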

Actually, knowing the file name would probably be just as good as knowing 
the process, since we could find the owner of the file and contact them.

dan feenberg

>
> rgt
>
> Whitehead Network/System Administrator
>
> ----- On Sep 1, 2016, at 3:05 PM, Daniel Feenberg feenberg at nber.org wrote:
>
>> Apparently heavy random I/O overloaded our fileserver last week, and
>> response was very slow. We solved the problem with additional spindles,
>> but we are curious to know which process is doing the random I/O. Perhaps
>> we could approach that user with an offer to help improve their turnaround
>> time by changing the code. Our users are mostly inexperienced students so
>> the possibility of suboptimal code is certainly there. Most usage is
>> sequential access to very large files that does not load the fileserver
>> much at all so this has been a new experience for us.
>>
>> We can easily track bytes/second but a process doing random I/O may use
>> very few bytes/second, but still occupy much of the fileserver's capacity,
>> so it hasn't been fruitful to identify the processes doing the most reads
>> and writes. During the period of overload, few disks were showing more
>> than a few kilobytes/second of read or write, yet iostat revealed that
>> several disks were continuously at 100% utilization.
>>
>> A program such as iostat will tell us which physical disk is busy, lsof
>> will tell us which file is open by which process, netstat and nfsstat will
>> give aggregate statistics over all processes, but I can't find a program
>> that will tell us which process is occupying the fileserver's attention
>> with expensive requests.
>>
>> We couldn't replace all the disks with SSD, but might be able to provide
>> SSD for some files, if we could identify the culprits.
>>
>> Daniel Feenberg
>>
>> _______________________________________________
>> bblisa mailing list
>> bblisa at bblisa.org
>> http://www.bblisa.org/mailman/listinfo/bblisa
>


