[one-users] Monitoring of a host can fail during a VM shutdown

Javier Fontan jfontan at opennebula.org
Thu Jan 23 01:53:25 PST 2014


Here it is: http://dev.opennebula.org/issues/2630
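
For reference, the race described in the quoted report below (a VM is listed, then disappears before its details are fetched) can be sketched in Ruby as follows. This is only an illustration under assumptions: fetch_domain_info, collect_vm_info and the hash standing in for the hypervisor are hypothetical names, not the actual code in vmm/kvm/poll, and the nil check is one possible guard, not necessarily the fix applied for the issue.

```ruby
# Hypothetical stand-in for the per-domain lookup done by the poll script.
# Returns nil when the domain no longer exists, e.g. because it was shut
# down between the listing and the lookup.
def fetch_domain_info(name, hypervisor)
  hypervisor[name]
end

# Collect info for every listed domain, skipping those that vanished.
def collect_vm_info(names, hypervisor)
  names.each_with_object({}) do |name, result|
    info = fetch_domain_info(name, hypervisor)
    # Without this guard, info[:state] raises NoMethodError on nil.
    next if info.nil?
    result[name] = { state: info[:state] }
  end
end

# Listing taken while both VMs were still running (assumed sample data).
names = ['one-304', 'one-305']
# By the time of the lookup, one-305 has been shut down.
hypervisor = { 'one-304' => { state: 'running' } }

vm_info = collect_vm_info(names, hypervisor)
```

With the guard in place, the vanished domain is simply left out of the result and the remaining VMs on the host are still reported.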

On Wed, Jan 22, 2014 at 10:56 PM, Nicolas Bélan <nicolas.belan at gmail.com> wrote:
> Hello,
>
> got something similar on one 4.2:
> Wed Jan 22 13:16:09 2014 [InM][I]: Monitoring host gimli02 (1)
> Wed Jan 22 13:16:14 2014 [InM][I]: Command execution fail: 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm 1 gimli02; else exit 42; fi'
> Wed Jan 22 13:16:14 2014 [InM][I]: error: failed to get domain 'one-305'
> Wed Jan 22 13:16:14 2014 [InM][I]: error: Domaine non trouvé : no domain with matching name 'one-305'
> Wed Jan 22 13:16:14 2014 [InM][I]: ../../vmm/kvm/poll:70:in `get_vm_info': undefined method `[]' for nil:NilClass (NoMethodError)
> Wed Jan 22 13:16:14 2014 [InM][I]: from ../../vmm/kvm/poll:68:in `each'
> Wed Jan 22 13:16:14 2014 [InM][I]: from ../../vmm/kvm/poll:68:in `get_vm_info'
> Wed Jan 22 13:16:14 2014 [InM][I]: from ../../vmm/kvm/poll:35:in `get_all_vm_info'
> Wed Jan 22 13:16:14 2014 [InM][I]: from ../../vmm/kvm/poll:401:in `print_all_vm_template'
> Wed Jan 22 13:16:14 2014 [InM][I]: from ../../vmm/kvm/poll:440
> Wed Jan 22 13:16:14 2014 [InM][E]: Error executing poll.sh
> Wed Jan 22 13:16:14 2014 [InM][I]: ExitCode: 1
> Wed Jan 22 13:16:14 2014 [ONE][E]: Error monitoring Host gimli02 (1): Error executing poll.sh
>
> The problem appears while deploying and removing 4 VMs in OneFlow, so
> it looks like the same problem Daniel reported.
>
> The problem causes a full reset of the VMs running on the hypervisor.
>
> Is there a bug ID linked to this problem report?
>
> Thanks,
> Nicolas.
>
> On 08/01/2014 11:17, Javier Fontan wrote:
>> This is a bug indeed. It can also fail in other cases like VM crash or
>> migration.
>>
>> Open that bug and we will look into it. When you see those crashes it
>> is safe to open a bug.
>>
>> On Wed, Dec 11, 2013 at 3:30 PM, Daniel Dehennin
>> <daniel.dehennin at baby-gnu.org> wrote:
>>> Hello,
>>>
>>> On a ONE 4.2, we just encountered a transient issue:
>>>
>>>     [InM][I]: Monitoring host grichka (9)
>>>     [InM][I]: Command execution fail: 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm 9 grichka; else exit 42; fi'
>>>     [InM][I]: error: failed to get domain 'one-1547'
>>>     [InM][I]: error: Domain not found: no domain with matching name 'one-1547'
>>>     [InM][I]: ../../vmm/kvm/poll:70:in `block in get_vm_info': undefined method `[]' for nil:NilClass (NoMethodError)
>>>     [InM][I]: from ../../vmm/kvm/poll:68:in `each'
>>>     [InM][I]: from ../../vmm/kvm/poll:68:in `get_vm_info'
>>>     [InM][I]: from ../../vmm/kvm/poll:35:in `get_all_vm_info'
>>>     [InM][I]: from ../../vmm/kvm/poll:401:in `print_all_vm_template'
>>>     [InM][I]: from ../../vmm/kvm/poll:440:in `<main>'
>>>     [InM][E]: Error executing poll.sh
>>>     [InM][I]: ExitCode: 1
>>>     [ONE][E]: Error monitoring Host grichka (9): Error executing poll.sh
>>>
>>>
>>> It looks like something lists all the running VMs and then calls
>>> “get_vm_info” for each one.
>>>
>>> This results in an error if a VM disappears in the meantime, because of
>>> a shutdown for example.
>>>
>>> Is this plausible? If so, I'll open an issue on the bug tracker.
>>>
>>> Regards.
>>>
>>> --
>>> Daniel Dehennin
>>> Get my GPG key:
>>> gpg --keyserver pgp.mit.edu --recv-keys 0x7A6FE2DF
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at lists.opennebula.org
>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>>
>>
>>
>



-- 
Javier Fontán Muiños
Developer
OpenNebula - The Open Source Toolkit for Data Center Virtualization
www.OpenNebula.org | @OpenNebula | github.com/jfontan


