[one-users] a new failure mode
yocum at fnal.gov
Mon Aug 15 14:08:33 PDT 2011
We've encountered a new "failure" mode which we don't know how to
recover from. Help!
The libvirtd daemon dies on a host node. oned can't successfully query
libvirtd for the state of the VMs, so all VMs enter the "unknown"
state. The user doesn't realize that libvirtd is the problem, so they
attempt 'onevm restart <vmid>', which results in a state of 'fail'.
A sysadmin comes along and restarts libvirtd on the host machine. All
VMs are now visible and oned can successfully query their state.
However, oned still thinks that the other VMs are in the 'fail' state,
so the user can't restart, stop, or do anything else to the VMs (which are
Is there a way to force oned to rescan all VMs, even if they're in a
yocum at fnal.gov, http://fermigrid.fnal.gov
"I fly because it releases my mind from the tyranny of petty things."