[one-users] KVM stack traces with high I/O load

Jurgen Weber jurgen.weber at theiconic.com.au
Mon Aug 13 23:42:05 PDT 2012


Hi Guys

I have a new KVM server running software RAID (mdadm); the VM disks are 
held in a RAID 5 across 5 disks (the system itself is on SSDs in a mirror).

So far I have about 10 VMs set up, but they are all unable to function: 
once we have a few up and then start to deploy/resubmit the VMs which 
have never booted properly, disk I/O stops, the scp process hangs and 
everything grinds to a halt. You will then find the following error in 
dmesg:

[ 1201.890311] INFO: task kworker/1:1:6185 blocked for more than 120 seconds.
[ 1201.890430] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1201.890569] kworker/1:1     D ffff88203fc13740     0  6185      2 0x00000000
[ 1201.890573]  ffff881ffe510140 0000000000000046 0000000000000000 ffff881039023590
[ 1201.890580]  0000000000013740 ffff8820393fffd8 ffff8820393fffd8 ffff881ffe510140
[ 1201.890586]  0000000000000000 0000000100000000 0000000000000001 7fffffffffffffff
[ 1201.890593] Call Trace:
[ 1201.890597]  [<ffffffff81349d2e>] ? schedule_timeout+0x2c/0xdb
[ 1201.890605]  [<ffffffff810ebdbf>] ? kmem_cache_alloc+0x86/0xea
[ 1201.890610]  [<ffffffff8134a58a>] ? __down_common+0x9b/0xee
[ 1201.890631]  [<ffffffffa0452c57>] ? xfs_getsb+0x28/0x3b [xfs]
[ 1201.890635]  [<ffffffff81063111>] ? down+0x25/0x34
[ 1201.890648]  [<ffffffffa041566f>] ? xfs_buf_lock+0x65/0x9d [xfs]
[ 1201.890665]  [<ffffffffa0452c57>] ? xfs_getsb+0x28/0x3b [xfs]
[ 1201.890685]  [<ffffffffa045b957>] ? xfs_trans_getsb+0x64/0xb4 [xfs]
[ 1201.890704]  [<ffffffffa0452a40>] ? xfs_mod_sb+0x21/0x77 [xfs]
[ 1201.890720]  [<ffffffffa0422736>] ? xfs_reclaim_inode+0x22d/0x22d [xfs]
[ 1201.890734]  [<ffffffffa041a43e>] ? xfs_fs_log_dummy+0x61/0x75 [xfs]
[ 1201.890754]  [<ffffffffa04573a7>] ? xfs_log_need_covered+0x4d/0x8d [xfs]
[ 1201.890769]  [<ffffffffa0422770>] ? xfs_sync_worker+0x3a/0x6a [xfs]
[ 1201.890773]  [<ffffffff8105aeaa>] ? process_one_work+0x163/0x284
[ 1201.890778]  [<ffffffff8105be72>] ? worker_thread+0xc2/0x145
[ 1201.890782]  [<ffffffff8105bdb0>] ? manage_workers.isra.23+0x15b/0x15b
[ 1201.890787]  [<ffffffff8105efad>] ? kthread+0x76/0x7e
[ 1201.890794]  [<ffffffff81351cf4>] ? kernel_thread_helper+0x4/0x10
[ 1201.890799]  [<ffffffff8105ef37>] ? kthread_worker_fn+0x139/0x139
[ 1201.890804]  [<ffffffff81351cf0>] ? gs_change+0x13/0x13

and lots of them. With these stack traces the CPU load just keeps 
climbing, and I have to power-cycle the machine to get the system back. 
I have added the following sysctls:

fs.file-max = 262144
kernel.pid_max = 262144
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 87380 8388608
net.core.rmem_max = 25165824
net.core.rmem_default = 25165824
net.core.wmem_max = 25165824
net.core.wmem_default = 131072
net.core.netdev_max_backlog = 8192
net.ipv4.tcp_window_scaling = 1
net.core.optmem_max = 25165824
net.core.somaxconn = 65536
net.ipv4.ip_local_port_range = 1024 65535
kernel.shmmax = 4294967296
vm.max_map_count = 262144
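For reference, the values above can be checked and applied at runtime; a minimal sketch, reading straight from /proc so no extra tools are needed (the writes shown in comments need root):

```shell
#!/bin/sh
# Read the current values of a few of the tunables from the list above,
# straight from /proc/sys, to confirm they actually took effect.
for key in fs/file-max kernel/pid_max vm/max_map_count; do
    printf '%s = %s\n' "$key" "$(cat /proc/sys/$key)"
done

# Writing requires root; either form applies a value immediately:
#   sysctl -w kernel.pid_max=262144
#   echo 262144 > /proc/sys/kernel/pid_max
# To persist across reboots, put the lines in /etc/sysctl.conf (or a
# file under /etc/sysctl.d/) and run 'sysctl -p'.
```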

but the important part I found was:
#http://blog.ronnyegner-consulting.de/2011/10/13/info-task-blocked-for-more-than-120-seconds/
vm.dirty_ratio=10

which does not seem to help, though.
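One quick way to see whether dirty pages are actually piling up before the hang, nothing KVM-specific:

```shell
#!/bin/sh
# Compare the amount of dirty/writeback memory against the configured
# thresholds. If Dirty keeps climbing while Writeback stays pinned, the
# array cannot drain writeback fast enough, and lowering dirty_ratio
# further only moves the stall earlier rather than fixing it.
grep -E '^(Dirty|Writeback):' /proc/meminfo
echo "dirty_ratio:            $(cat /proc/sys/vm/dirty_ratio)"
echo "dirty_background_ratio: $(cat /proc/sys/vm/dirty_background_ratio)"
```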

Now some info on the disk:
#mount
/dev/md2 on /data type xfs (rw,noatime,attr2,delaylog,sunit=1024,swidth=4096,noquota)
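For completeness, the stripe geometry XFS was mounted with can be cross-checked against the md array; a sketch (device and mountpoint as above; mdadm needs root, and each step is skipped quietly if the tool or file is unavailable):

```shell
#!/bin/sh
# Show the md array layout. The chunk size times the number of data
# disks (4 of the 5 in a RAID 5) should match the swidth XFS was given;
# sunit=1024,swidth=4096 above are in 512-byte sectors, i.e. a 512 KB
# chunk across 4 data disks.
cat /proc/mdstat 2>/dev/null || true

# mdadm reports chunk size and member disks; xfs_info reports sunit and
# swidth as the filesystem sees them.
command -v mdadm    >/dev/null 2>&1 && mdadm --detail /dev/md2 || true
command -v xfs_info >/dev/null 2>&1 && xfs_info /data          || true
```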

cat /proc/meminfo
MemTotal:       132259720 kB
MemFree:        122111692 kB

cat /proc/cpuinfo (32 virtual cores)
processor    : 31
vendor_id    : GenuineIntel
cpu family    : 6
model        : 45
model name    : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz

Some info on the host:
# cat /etc/debian_version
wheezy/sid
# uname -a
Linux chaos 3.2.0-3-amd64 #1 SMP Thu Jun 28 09:07:26 UTC 2012 x86_64 GNU/Linux
ii  kvm                  1:1.1.0+dfsg-3  dummy transitional package from kvm to qemu-kvm
ii  qemu-kvm             1.1.0+dfsg-3    Full virtualization on x86 hardware
ii  libvirt-bin          0.9.12-4        programs for the libvirt library
ii  libvirt0             0.9.12-4        library for interfacing with different virtualization systems
ii  python-libvirt       0.9.12-4        libvirt Python bindings
ii  opennebula           3.4.1-3+b1      controller which executes the OpenNebula cluster services
ii  opennebula-common    3.4.1-3         empty package to create OpenNebula users and directories
ii  opennebula-sunstone  3.4.1-3         web interface to which executes the OpenNebula cluster services
ii  opennebula-tools     3.4.1-3         Command-line tools for OpenNebula Cloud
ii  ruby-opennebula      3.4.1-3         Ruby bindings for OpenNebula Cloud API (OCA)

Any ideas on how to get this working? Right now the server is a lemon! :0

-- 
Jurgen Weber

Systems Engineer
IT Infrastructure Team Leader

THE ICONIC | E jurgen.weber at theiconic.com.au | www.theiconic.com.au



