linux - High iowait for unknown reasons

I'm having a hard time figuring out what is causing the high iowait on my server.

Here is the output of iostat -xm 5 5:

Linux 2.6.32-358.6.1.el6.x86_64 (prod-1.localdomain)    09/28/2013  _x86_64_    (16 cpu)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.98    0.05    3.72    3.54    0.00   85.71

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.08   120.88   30.27    1.72     0.96     0.48    92.20     0.34   10.67   3.79  12.13
sda               7.63    37.19    8.96    4.89     0.35     0.16    76.40     0.16   11.63   2.19   3.04

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.41    0.00    6.20   37.65    0.00   50.74

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00   109.80  186.20    1.40     3.75     0.43    45.66    98.21  519.80   5.33 100.00
sda              33.20     3.40   18.00    2.00     0.37     0.02    40.32     0.07    3.41   3.17   6.34

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.55    0.00    7.42   30.06    0.00   56.97

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.00  196.00    0.00     3.91     0.00    40.85   100.41  506.01   5.10 100.00
sda               0.00     2.40    1.80    2.60     0.05     0.02    30.91     0.01    2.95   2.73   1.20

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.71    0.00    7.04   31.76    0.00   55.49

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00   100.00  189.00    1.20     3.72     0.40    44.33    95.32  514.88   5.26 100.00
sda              33.20     4.20   19.20    5.20     0.39     0.04    35.80     0.02    1.01   0.79   1.92

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          61.93    0.00   10.08   14.99    0.00   12.99

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00     0.40  185.40    1.40     3.76     0.01    41.31    83.22  431.16   5.28  98.62
sda              33.20     5.40    9.60    4.00     0.21     0.04    37.65     0.02    1.24   1.04   1.42

As you can see, all the metrics look normal except await and %util, which are very high.
So I suspect there may be a problem with /dev/sdb.
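As a rough sanity check on the sdb figures above, the columns can be cross-checked against each other. This is only a sketch: the numbers are copied from the third iostat sample, the queue estimate uses Little's law, and the utilisation estimate assumes that this sysstat version derives svctm from %util and the request rate.

```python
# Figures copied from the third iostat sample for sdb above.
reads_per_s = 196.00    # r/s
writes_per_s = 0.00     # w/s
read_mb_per_s = 3.91    # rMB/s
await_ms = 506.01       # await
svctm_ms = 5.10         # svctm

iops = reads_per_s + writes_per_s

# Average request size: read throughput divided by read request rate.
avg_request_kib = read_mb_per_s * 1024 / reads_per_s

# Little's law: requests in flight = request rate x time spent in the system.
est_queue_depth = iops * await_ms / 1000

# Utilisation estimate: request rate x service time, capped at 100%.
est_util_pct = min(100.0, iops * svctm_ms / 1000 * 100)

print(f"average request size:  {avg_request_kib:.1f} KiB")
print(f"estimated queue depth: {est_queue_depth:.1f}")
print(f"estimated utilisation: {est_util_pct:.1f}%")
```

The estimates land close to the reported avgrq-sz (40.85 sectors of 512 bytes), avgqu-sz (100.41) and %util (100.00), which suggests the disk is saturated by many small read requests rather than by a large volume of data.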

But smartctl reports nothing useful.

smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-358.6.1.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital RE4 Serial ATA
Device Model:     WDC WD2003FYYS-02W0B1
Serial Number:    WD-WMAY04093732
LU WWN Device Id: 5 0014ee 05877b196
Firmware Version: 01.01D02
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sat Sep 28 09:05:30 2013 ICT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                    was suspended by an interrupting command from host.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (29160) seconds.
Offline data collection
capabilities:            (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    ( 283) minutes.
Conveyance self-test routine
recommended polling time:    (   5) minutes.
SCT capabilities:          (0x303f) SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       9100
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       42
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       7373
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       40
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       31
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       10
194 Temperature_Celsius     0x0022   123   107   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I'm stuck here and don't know what to do next to solve the problem.

Any help would be much appreciated!

Update:

@MichaelHampton

My self-test log, with nothing interesting in it. smartctl -l selftest /dev/sdb

smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-358.6.1.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      7380

@Mife

My pidstat -d 1 30 results.

Linux 2.6.32-358.6.1.el6.x86_64 (cass-23_120.localdomain)   09/28/2013  _x86_64_    (16 cpu)

05:57:43 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command

05:57:44 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command

05:57:45 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:46 PM      1555    736.00      0.00      0.00  java
05:57:46 PM     16698      0.00      4.00      0.00  java

05:57:46 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:47 PM       552      0.00     68.00      0.00  jbd2/sda3-8
05:57:47 PM      1555    352.00      0.00      0.00  java
05:57:47 PM     16698      0.00     12.00      0.00  java
05:57:47 PM     18074      0.00      4.00      0.00  java
05:57:47 PM     19295   1564.00      0.00      0.00  java

05:57:47 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:48 PM      1554   3128.00      8.00      4.00  xinetd
05:57:48 PM      1570    840.00      0.00      0.00  gmond
05:57:48 PM      2183      0.00      4.00      0.00  java
05:57:48 PM      2394     64.00      0.00      0.00  rsync
05:57:48 PM      2395    324.00      0.00      0.00  ssh
05:57:48 PM     13280     28.00      0.00      0.00  downloadm_new.s
05:57:48 PM     19295   1724.00      0.00      0.00  java

05:57:48 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:49 PM     19295   1744.00      0.00      0.00  java

05:57:49 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:50 PM      1083      0.00      8.00      0.00  flush-8:0
05:57:50 PM      1086      0.00      8.00      0.00  java
05:57:50 PM      2183      0.00     12.00      0.00  java
05:57:50 PM     13280    388.00      0.00      0.00  downloadm_new.s
05:57:50 PM     18074      0.00      4.00      0.00  java
05:57:50 PM     19295   1728.00      0.00      0.00  java

05:57:50 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:51 PM      2183      0.00      4.00      0.00  java
05:57:51 PM      2400      8.00      0.00      0.00  sleep
05:57:51 PM     18074      0.00      4.00      0.00  java
05:57:51 PM     19295   1680.00      0.00      0.00  java

05:57:51 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:52 PM       552      0.00     28.00      0.00  jbd2/sda3-8
05:57:52 PM      1112      0.00      4.00      0.00  jbd2/sda4-8
05:57:52 PM      2183      0.00      8.00      0.00  java
05:57:52 PM     16698      0.00      4.00      0.00  java
05:57:52 PM     18074      0.00      4.00      0.00  java
05:57:52 PM     19295   1672.00      0.00      0.00  java

05:57:52 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:53 PM      1555    376.00     20.00      0.00  java
05:57:53 PM      1570    792.00      0.00      0.00  gmond
05:57:53 PM     19295   1568.00      8.00      0.00  java

05:57:53 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:54 PM      3734    844.00    188.00      0.00  java
05:57:54 PM     19295   1672.00      0.00      0.00  java

05:57:54 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:55 PM      1083      0.00     20.00      0.00  flush-8:0
05:57:55 PM      3734   2620.00   1156.00      0.00  java
05:57:55 PM      4327      0.00      8.00      0.00  java
05:57:55 PM      9677      0.00      8.00      0.00  java
05:57:55 PM     16613      0.00      8.00      0.00  java
05:57:55 PM     19295   1272.00      8.00      0.00  java
05:57:55 PM     19426      0.00      8.00      0.00  java

05:57:55 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:56 PM      3734   3592.00   1200.00      0.00  java
05:57:56 PM     19295    332.00      0.00      0.00  java
05:57:56 PM     19426      0.00      4.00      0.00  java

05:57:56 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:57 PM       552      0.00     36.00      0.00  jbd2/sda3-8
05:57:57 PM      2405   1068.00     32.00      0.00  java
05:57:57 PM      3734   2972.00    828.00      0.00  java
05:57:57 PM      5457      0.00      8.00      0.00  java
05:57:57 PM      9677  28424.00    144.00     20.00  java
05:57:57 PM     16698      0.00      8.00      0.00  java
05:57:57 PM     18074      0.00      4.00      0.00  java
05:57:57 PM     19295      0.00      4.00      0.00  java

05:57:57 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:58 PM      2183      0.00      4.00      0.00  java
05:57:58 PM      2222      0.00      4.00      0.00  pidstat
05:57:58 PM      2405    500.00      0.00      0.00  java
05:57:58 PM      3734   4016.00    720.00      0.00  java
05:57:58 PM      5457      0.00      8.00      0.00  java
05:57:58 PM     16698      0.00      4.00      0.00  java

05:57:58 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:59 PM      1112      0.00      8.00      0.00  jbd2/sda4-8
05:57:59 PM      3734   4572.00    372.00      0.00  java

05:57:59 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:00 PM      1083      0.00     32.00      0.00  flush-8:0
05:58:00 PM      2405    496.00      0.00      0.00  java
05:58:00 PM      3734   5412.00      4.00      0.00  java
05:58:00 PM      5457      0.00     16.00      0.00  java
05:58:00 PM     11681      0.00      8.00      0.00  java
05:58:00 PM     14824      0.00      8.00      0.00  java
05:58:00 PM     16698      0.00     12.00      0.00  java
05:58:00 PM     17694      0.00      8.00      0.00  java
05:58:00 PM     18074      0.00     12.00      0.00  java
05:58:00 PM     18129      0.00      8.00      0.00  java
05:58:00 PM     19542      0.00      8.00      0.00  java

05:58:00 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:01 PM      3734   3888.00      0.00      0.00  java
05:58:01 PM      3813      8.00     12.00      0.00  java
05:58:01 PM     13280     28.00      0.00      0.00  downloadm_new.s
05:58:01 PM     18074      0.00      8.00      0.00  java

05:58:01 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:02 PM       552      0.00     44.00      0.00  jbd2/sda3-8
05:58:02 PM      1129      0.00     16.00      0.00  jbd2/sdb1-8
05:58:02 PM      2405    256.00      0.00      0.00  java
05:58:02 PM      3734   1200.00   1128.00      0.00  java
05:58:02 PM     16698      0.00      4.00      0.00  java
05:58:02 PM     18074      0.00      8.00      0.00  java

05:58:02 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:03 PM      1570   1172.00      0.00      0.00  gmond
05:58:03 PM      2183      0.00      4.00      0.00  java
05:58:03 PM      2405    256.00      0.00      0.00  java
05:58:03 PM      2442     92.00      0.00      0.00  rsync
05:58:03 PM      2443    916.00      0.00      0.00  ssh
05:58:03 PM      3734    576.00      0.00      0.00  java
05:58:03 PM      9677      0.00      4.00      0.00  java
05:58:03 PM     13280     56.00   8996.00      0.00  downloadm_new.s

05:58:03 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:04 PM      2183      0.00      4.00      0.00  java
05:58:04 PM      2405    256.00      0.00      0.00  java
05:58:04 PM      2443      8.00      0.00      0.00  ssh
05:58:04 PM      3734   2032.00     16.00      0.00  java

05:58:04 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:05 PM      1083      0.00      4.00      0.00  flush-8:0
05:58:05 PM      2405    224.00      0.00      0.00  java
05:58:05 PM      2446    160.00      0.00      0.00  sleep
05:58:05 PM      3734   5344.00    648.00      0.00  java
05:58:05 PM      3813      0.00      8.00      0.00  java
05:58:05 PM     13280   1016.00      0.00      0.00  downloadm_new.s

05:58:05 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:06 PM      2405     16.00      0.00      0.00  java
05:58:06 PM      3734   6196.00    344.00      0.00  java

05:58:06 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:07 PM       552      0.00      8.00      0.00  jbd2/sda3-8
05:58:07 PM      2405    112.00      0.00      0.00  java
05:58:07 PM      3734   3532.00      0.00      0.00  java
05:58:07 PM     16698      0.00      4.00      0.00  java
05:58:07 PM     18074      0.00      4.00      0.00  java

05:58:07 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:08 PM      1570   1172.00      0.00      0.00  gmond
05:58:08 PM      2183      0.00      4.00      0.00  java
05:58:08 PM      2405    352.00      0.00      0.00  java
05:58:08 PM      3734   4588.00      0.00      0.00  java
05:58:08 PM     16698      0.00      8.00      0.00  java

05:58:08 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:09 PM      2222      0.00      4.00      0.00  pidstat
05:58:09 PM      2405    368.00      0.00      0.00  java
05:58:09 PM      3734   1720.00      0.00      0.00  java
05:58:09 PM     16698      0.00      4.00      0.00  java

05:58:09 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:10 PM      1083      0.00      8.00      0.00  flush-8:0
05:58:10 PM      2405    480.00      0.00      0.00  java
05:58:10 PM      3734     40.00     16.00      0.00  java
05:58:10 PM     17768      0.00      8.00      0.00  java

05:58:10 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:11 PM      2405    608.00      0.00      0.00  java
05:58:11 PM      3734    264.00      0.00      0.00  java
05:58:11 PM     19426      0.00      4.00      0.00  java

05:58:11 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:12 PM      1129      0.00     24.00      0.00  jbd2/sdb1-8
05:58:12 PM      2405    240.00      0.00      0.00  java
05:58:12 PM     18074      0.00      8.00      0.00  java

05:58:12 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:13 PM      1570   1172.00      0.00      0.00  gmond
05:58:13 PM      2183      0.00      4.00      0.00  java
05:58:13 PM      2405    128.00      0.00      0.00  java
05:58:13 PM     18074      0.00      4.00      0.00  java

Average:          PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
Average:          552      0.00      6.13      0.00  jbd2/sda3-8
Average:         1083      0.00      2.40      0.00  flush-8:0
Average:         1086      0.00      0.27      0.00  java
Average:         1112      0.00      0.40      0.00  jbd2/sda4-8
Average:         1129      0.00      1.33      0.00  jbd2/sdb1-8
Average:         1554    104.16      0.27      0.13  xinetd
Average:         1570    171.43      0.00      0.00  gmond
Average:         2183      0.00      1.60      0.00  java
Average:         2222      0.00      0.27      0.00  pidstat
Average:         2405    178.49      1.07      0.00  java
Average:         2446      5.33      0.00      0.00  sleep
Average:         3734   1778.49    220.45      0.00  java
Average:         3813      0.27      0.67      0.00  java
Average:         4327      0.00      0.27      0.00  java
Average:         5457      0.00      1.07      0.00  java
Average:         9677    946.52      5.19      0.67  java
Average:        11681      0.00      0.27      0.00  java
Average:        13280     50.48    299.57      0.00  downloadm_new.s
Average:        14824      0.00      0.27      0.00  java
Average:        16613      0.00      0.27      0.00  java
Average:        16698      0.00      2.13      0.00  java
Average:        17694      0.00      0.27      0.00  java
Average:        17768      0.00      0.27      0.00  java
Average:        18074      0.00      2.13      0.00  java
Average:        18129      0.00      0.27      0.00  java
Average:        19295    498.04      0.67      0.00  java
Average:        19426      0.00      0.53      0.00  java
Average:        19542      0.00      0.27      0.00  java

@kworr

Here are the mount options for my /dev/sdb1.

% mount | grep sdb
/dev/sdb1 on /backup type ext4 (rw,noatime,commit=100)

UPDATE 2
How many IOPS your hard drive should be able to deliver:

Device                  Type    IOPS                Interface
7,200 rpm SATA drives   HDD     ~75-100 IOPS[2]     SATA 3 Gbit/s
10,000 rpm SATA drives  HDD     ~125-150 IOPS[2]    SATA 3 Gbit/s
10,000 rpm SAS drives   HDD     ~140 IOPS[2]        SAS
15,000 rpm SAS drives   HDD     ~175-210 IOPS[2]    SAS
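To put the table next to the measurements, here is a minimal sketch that compares the request rate seen on sdb with the typical ranges listed. The observed values are r/s plus w/s from the four saturated iostat samples earlier in the question. Treating the WD RE4 as a 7,200 rpm SATA drive is an assumption, and iostat counts requests after merging, so the comparison is only indicative.

```python
from statistics import mean

# Typical IOPS ranges (low, high) taken from the table above.
TYPICAL_IOPS = {
    "7,200 rpm SATA": (75, 100),
    "10,000 rpm SATA": (125, 150),
    "10,000 rpm SAS": (140, 140),
    "15,000 rpm SAS": (175, 210),
}

# r/s + w/s for sdb in the four iostat samples where %util was near 100.
OBSERVED_SDB_IOPS = [186.20 + 1.40, 196.00 + 0.00, 189.00 + 1.20, 185.40 + 1.40]


def load_factor(observed_iops, drive_class):
    """Observed IOPS relative to the top of the typical range for a class."""
    _, high = TYPICAL_IOPS[drive_class]
    return observed_iops / high


avg_iops = mean(OBSERVED_SDB_IOPS)
print(f"mean observed IOPS on sdb: {avg_iops:.1f}")
for drive_class in TYPICAL_IOPS:
    factor = load_factor(avg_iops, drive_class)
    print(f"{drive_class}: {factor:.2f}x the typical maximum")
```

A factor above 1 for the assumed drive class would be consistent with the long queue and high await shown by iostat.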

Solution

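There is a lot going on here, but PID 3734, a java process, appears to be your culprit. You should find out what it is doing, what arguments were passed to it, what its parent PID is, and what it is meant to do.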
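One way to gather those details is to read them from /proc. The sketch below is illustrative only: `describe_process` is a hypothetical helper name, it assumes a Linux /proc filesystem, and it inspects its own PID by default so that it runs anywhere; substitute 3734 on the affected server. Reading another user's working directory usually requires root.

```python
import os


def describe_process(pid):
    """Return the arguments, parent PID and working directory of a process."""
    base = f"/proc/{pid}"

    # Arguments are NUL-separated in /proc/<pid>/cmdline.
    with open(f"{base}/cmdline", "rb") as f:
        args = [a.decode(errors="replace") for a in f.read().split(b"\0") if a]

    # The parent PID is reported on the "PPid:" line of /proc/<pid>/status.
    ppid = None
    with open(f"{base}/status") as f:
        for line in f:
            if line.startswith("PPid:"):
                ppid = int(line.split()[1])
                break

    # The working directory is a symlink; it may not be readable without root.
    try:
        cwd = os.readlink(f"{base}/cwd")
    except OSError:
        cwd = None

    return {"pid": pid, "args": args, "ppid": ppid, "cwd": cwd}


# Inspect the current process as a demonstration; use 3734 on the server.
print(describe_process(os.getpid()))
```

The java arguments normally include the main class or jar, which is usually enough to identify which application the process belongs to.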

Over the 30 one-second samples, that java process read an average of 1778.49 kB/s. Two other java processes, PIDs 9677 and 19295, read an average of 946.52 and 498.04 kB/s respectively.
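The ranking can be reproduced from the "Average:" rows of the pidstat output. This is a small sketch, assuming the six-column layout shown in the question (no UID column) and command names without spaces; `rank_readers` is a hypothetical helper, and the sample rows are a subset copied from the output above. Note that pidstat -d reports I/O per process across all devices, not just sdb.

```python
def rank_readers(pidstat_output, top=3):
    """Return the top readers as (pid, command, kB_rd/s), highest first."""
    rows = []
    for line in pidstat_output.splitlines():
        fields = line.split()
        # Expected row: Average: PID kB_rd/s kB_wr/s kB_ccwr/s Command
        if len(fields) != 6 or fields[0] != "Average:":
            continue
        if not fields[1].isdigit():
            continue  # skip the header row
        rows.append((float(fields[2]), int(fields[1]), fields[5]))
    rows.sort(reverse=True)
    return [(pid, cmd, rd) for rd, pid, cmd in rows[:top]]


# A subset of the "Average:" rows from the pidstat output in the question.
SAMPLE = """\
Average:          PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
Average:         1554    104.16      0.27      0.13  xinetd
Average:         1570    171.43      0.00      0.00  gmond
Average:         2405    178.49      1.07      0.00  java
Average:         3734   1778.49    220.45      0.00  java
Average:         9677    946.52      5.19      0.67  java
Average:        13280     50.48    299.57      0.00  downloadm_new.s
Average:        19295    498.04      0.67      0.00  java
"""

for pid, cmd, rd in rank_readers(SAMPLE):
    print(f"{pid:>6} {cmd:<16} {rd:>8.2f} kB/s read")
```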

I can't tell you whether what they are doing is right or wrong, but your I/O is high mainly because of those java processes.
