[one-users] Monitoring of a host can fail during a VM shutdown

Nicolas Bélan nicolas.belan at gmail.com
Wed Jan 22 13:56:26 PST 2014


Hello,

I got something similar on ONE 4.2:
Wed Jan 22 13:16:09 2014 [InM][I]: Monitoring host gimli02 (1)
Wed Jan 22 13:16:14 2014 [InM][I]: Command execution fail: 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm 1 gimli02; else exit 42; fi'
Wed Jan 22 13:16:14 2014 [InM][I]: error: failed to get domain 'one-305'
Wed Jan 22 13:16:14 2014 [InM][I]: error: Domaine non trouvé : no domain with matching name 'one-305'
Wed Jan 22 13:16:14 2014 [InM][I]: ../../vmm/kvm/poll:70:in `get_vm_info': undefined method `[]' for nil:NilClass (NoMethodError)
Wed Jan 22 13:16:14 2014 [InM][I]: from ../../vmm/kvm/poll:68:in `each'
Wed Jan 22 13:16:14 2014 [InM][I]: from ../../vmm/kvm/poll:68:in `get_vm_info'
Wed Jan 22 13:16:14 2014 [InM][I]: from ../../vmm/kvm/poll:35:in `get_all_vm_info'
Wed Jan 22 13:16:14 2014 [InM][I]: from ../../vmm/kvm/poll:401:in `print_all_vm_template'
Wed Jan 22 13:16:14 2014 [InM][I]: from ../../vmm/kvm/poll:440
Wed Jan 22 13:16:14 2014 [InM][E]: Error executing poll.sh
Wed Jan 22 13:16:14 2014 [InM][I]: ExitCode: 1
Wed Jan 22 13:16:14 2014 [ONE][E]: Error monitoring Host gimli02 (1): Error executing poll.sh

The problem appears while deploying and removing 4 VMs in OneFlow, so
it looks like the same problem Daniel reported.
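
To illustrate how I understand the race, here is a minimal sketch. It is
not the actual poll code: query_domain, the RUNNING table and both
vm_info_* methods are made-up names, and I am only assuming that the real
script ends up calling [] on a nil result when the domain has vanished
between the listing and the per-VM query.

```ruby
# Hypothetical stand-in for the hypervisor: only 'one-304' still exists.
RUNNING = { 'one-304' => { 'state' => 'running' } }

# Hypothetical per-domain query. Returns nil when the domain is gone,
# e.g. because it was shut down after the listing was taken.
def query_domain(name)
  RUNNING[name]
end

# Unguarded version: indexes the result directly, so a vanished domain
# raises NoMethodError (undefined method `[]' for nil), as in the trace.
def vm_info_unguarded(names)
  names.map { |n| [n, query_domain(n)['state']] }.to_h
end

# Guarded version: skips domains that disappeared instead of aborting
# the whole monitoring run.
def vm_info_guarded(names)
  names.each_with_object({}) do |n, out|
    info = query_domain(n)
    next if info.nil?
    out[n] = info['state']
  end
end

# 'one-305' was listed as running, then removed before it was queried.
listed = ['one-304', 'one-305']

begin
  vm_info_unguarded(listed)
rescue NoMethodError => e
  puts "unguarded: #{e.class}"
end

p vm_info_guarded(listed)
```

With a guard like this, a VM that goes away mid-poll would simply be
missing from that monitoring cycle rather than failing the whole host.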

The problem triggers a full reset of the running VMs on the hypervisor.

Is there a bug ID linked to this PR?

Thanks,
Nicolas.

On 08/01/2014 11:17, Javier Fontan wrote:
> This is a bug indeed. It can also fail in other cases like VM crash or
> migration.
>
> Open that bug and we will look into it. When you see those crashes it
> is safe to open a bug.
>
> On Wed, Dec 11, 2013 at 3:30 PM, Daniel Dehennin
> <daniel.dehennin at baby-gnu.org> wrote:
>> Hello,
>>
>> On a ONE 4.2, we just encountered a transient issue:
>>
>>     [InM][I]: Monitoring host grichka (9)
>>     [InM][I]: Command execution fail: 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm 9 grichka; else exit 42; fi'
>>     [InM][I]: error: failed to get domain 'one-1547'
>>     [InM][I]: error: Domain not found: no domain with matching name 'one-1547'
>>     [InM][I]: ../../vmm/kvm/poll:70:in `block in get_vm_info': undefined method `[]' for nil:NilClass (NoMethodError)
>>     [InM][I]: from ../../vmm/kvm/poll:68:in `each'
>>     [InM][I]: from ../../vmm/kvm/poll:68:in `get_vm_info'
>>     [InM][I]: from ../../vmm/kvm/poll:35:in `get_all_vm_info'
>>     [InM][I]: from ../../vmm/kvm/poll:401:in `print_all_vm_template'
>>     [InM][I]: from ../../vmm/kvm/poll:440:in `<main>'
>>     [InM][E]: Error executing poll.sh
>>     [InM][I]: ExitCode: 1
>>     [ONE][E]: Error monitoring Host grichka (9): Error executing poll.sh
>>
>>
>> It looks like something lists all the running VMs and then calls
>> “get_vm_info” for each one.
>>
>> This results in an error if a VM disappears in the meantime, because of
>> a shutdown for example.
>>
>> Is this something plausible, in which case I'll open an issue on the bug
>> tracker?
>>
>> Regards.
>>
>> --
>> Daniel Dehennin
>> Get my GPG key:
>> gpg --keyserver pgp.mit.edu --recv-keys 0x7A6FE2DF
>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.opennebula.org
>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>
>
>



