[BBLISA] accounting for I/O

Rich Braun richb at pioneer.ci.net
Thu Sep 1 18:33:36 EDT 2016


>>>> During the period of overload, few disks were showing more
>>>> than kilobytes/second of read or write, yet iostat revealed that several
>>>> disks were continuously at 100%.

When I see this on bare-metal hardware, I first suspect a disk problem. Failing disks often show exactly this symptom before dying completely: the drive spends its time on internal block-repair operations, so it stays 100% busy while moving only a few KB/sec, as reported here. Then the symptom goes away for days, maybe months, and then: boom, server failure.
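To spot the "100% busy but barely moving data" pattern without eyeballing iostat, something like the sketch below might help. It assumes a Linux host and the standard /proc/diskstats layout (sectors counted in 512-byte units, the 13th field being milliseconds spent doing I/O). The function names and the 90% / 100 KB/sec thresholds are my own placeholders, not anything from the original report.

```python
import time

def parse_diskstats(text):
    """Map device name -> (sectors_read, sectors_written, io_ticks_ms)."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue  # skip short or malformed lines
        stats[f[2]] = (int(f[5]), int(f[9]), int(f[12]))
    return stats

def busy_but_idle(before, after, interval_s, util_pct=90.0, max_kb_s=100.0):
    """Return (device, %util, KB/sec) for disks that are busy yet slow.

    The thresholds are illustrative assumptions; tune them for your hardware.
    """
    suspects = []
    for dev, (r1, w1, t1) in after.items():
        if dev not in before:
            continue
        r0, w0, t0 = before[dev]
        # io_ticks is in milliseconds, so compare against the interval in ms.
        util = 100.0 * (t1 - t0) / (interval_s * 1000.0)
        # diskstats sectors are always 512 bytes.
        kb_s = ((r1 - r0) + (w1 - w0)) * 512 / 1024.0 / interval_s
        if util >= util_pct and kb_s <= max_kb_s:
            suspects.append((dev, round(util, 1), round(kb_s, 1)))
    return suspects

def main(interval_s=5):
    """Take two samples of /proc/diskstats and print suspect disks."""
    with open("/proc/diskstats") as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval_s)
    with open("/proc/diskstats") as fh:
        after = parse_diskstats(fh.read())
    for dev, util, kb_s in busy_but_idle(before, after, interval_s):
        print(f"{dev}: {util}% busy at only {kb_s} KB/sec")
```

Calling main() samples twice, five seconds apart. A disk that shows up here repeatedly is a candidate for a closer look, not proof of failure: a burst of tiny random I/O can produce the same numbers.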

Check the drives' health with smartctl (from smartmontools). Look for a Nagios check script that continually monitors SMART metrics and temperature.

-rich

