[one-users] OpenNebula 4.6.0 monitoring question

Steven Timm timm at fnal.gov
Wed Jul 30 06:53:50 PDT 2014


On Wed, 30 Jul 2014, Ruben S. Montero wrote:

> 
> Not really sure what is going on. The monitor scripts return information on all VMs running on the node. In 4.6 the
> monitoring system uses a push approach over UDP, so the information may be coming from misbehaving monitoring
> daemons. This can sometimes happen in dev environments if you are resetting the DB.
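
To illustrate the push model described above, here is a minimal sketch over
loopback UDP. It is not OpenNebula's actual monitoring code: the payload
format ("<host> RUNNING_VMS=<n>") and the function names are hypothetical,
and only the host name fgtest14 comes from this thread. The point is that a
collector receiving pushed datagrams cannot tell a current daemon from a
stale one, so whichever report it reads last is the one it keeps.

```python
import socket


def push_once(payload: bytes, collector_addr) -> None:
    """Send one monitoring datagram. UDP offers no delivery or identity check."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        sender.sendto(payload, collector_addr)


def collect(sock: socket.socket, count: int) -> dict:
    """Receive `count` datagrams and keep the most recently read payload per host."""
    latest = {}
    for _ in range(count):
        data, _addr = sock.recvfrom(4096)
        # Hypothetical payload format: "<host> <info>"
        host, _, info = data.decode().partition(" ")
        latest[host] = info
    return latest


def demo() -> dict:
    """Two daemons report for the same host; the collector accepts both."""
    collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    collector.bind(("127.0.0.1", 0))  # ephemeral port, loopback only
    collector.settimeout(2.0)         # do not hang if a datagram is lost
    addr = collector.getsockname()
    try:
        push_once(b"fgtest14 RUNNING_VMS=0", addr)  # report from a current daemon
        push_once(b"fgtest14 RUNNING_VMS=3", addr)  # report from a stale daemon
        return collect(collector, 2)
    finally:
        collector.close()


if __name__ == "__main__":
    print(demo())
```

If something like this is happening on a real node, the practical check would
be to look for leftover monitoring processes on that node and stop them
before trusting the reported state.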

When we ran the update to take this database from ONE 4.4 to ONE 4.6, one
host (the aforementioned fgtest14) and one datastore (image store 101) were
wiped out of the database. I reinserted both and restarted
OpenNebula.

Steve Timm




> 
> On Jul 28, 2014 6:32 PM, "Steven Timm" <timm at fnal.gov> wrote:
>
>       I am currently dealing with an unexplained monitoring question
>       in OpenNebula 4.6 on my development cloud.
>
>       I frequently see OpenNebula report the status of a ONE
>       host as "ON" even when a system misconfiguration makes it
>       impossible, given the credentials, for OpenNebula to
>       even ssh into the node as oneadmin.
> 
>
>       I've fixed all those instances and restarted OpenNebula,
>       but OpenNebula still reports a number of VMs
>       in state "running" even though the node they are supposedly
>       running on was rebooted three days ago and is running no
>       virtual machines whatsoever.
>
>       I think I could be dealing with database corruption of some type
>       (generated on the one4.4->one4.6 update), or there could
>       be some problem with the remote scripts on the nodes.
>       I saw, and I think I fixed, the problems with the database
>       corruption (namely one of the hosts and one of the datastores
>       got knocked out of the database for reasons unknown, and I
>       re-inserted them). In any case, some error handling
>       in the monitoring is not working, and something is
>       exiting with status 0 when it should not.
>
>       Ideas?  Has anyone else seen something like this?
>
>       Steve Timm
> 
> 
>
>       ------------------------------------------------------------------
>       Steven C. Timm, Ph.D  (630) 840-8525
>       timm at fnal.gov  http://home.fnal.gov/~timm/
>       Fermilab Scientific Computing Division, Scientific Computing Services Quad.
>       Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing
> 
> 
>


