[one-users] ONE 2.0 underreporting running VM's in onehost list

Fabian Wenk fabian at wenks.ch
Tue Sep 27 07:14:01 PDT 2011


On 26.09.2011 13:44, Carlos Martín Sánchez wrote:
> The host_shares contains the "running_vms" column; you need to update that
> column value with OpenNebula stopped.
> We are still trying to figure out what causes this bug, so if you come
> across it again, it would be great if you could write down the operations
> that led to it.

I do not know if this is related or not, but I guess it could be 
an indication.

I am running OpenNebula 2.2.1 with a MySQL database. I just 
restarted mysqld, and now all the one* commands report errors like this:

# onevm list
[VirtualMachinePoolInfo] Error getting VM Pool.

In oned.log I see the following messages (regarding the 'onevm 
list' command):

Tue Sep 27 13:47:20 2011 [ReM][D]: VirtualMachinePoolInfo method 
Tue Sep 27 13:47:20 2011 [ONE][E]: SQL command was: SELECT 
vm_pool.oid, vm_pool.uid, vm_pool.name, vm_pool.last_poll, 
vm_pool.state, vm_pool.lcm_state, vm_pool.stime, vm_pool.etime, 
vm_pool.deploy_id, vm_pool.memory, vm_pool.cpu, vm_pool.net_tx, 
vm_pool.net_rx, vm_pool.last_seq, vm_pool.template, 
user_pool.user_name, history.vid, history.seq, history.host_name, 
history.vm_dir, history.hid, history.vm_mad, history.tm_mad, 
history.stime, history.etime, history.pstime, history.petime, 
history.rstime, history.retime, history.estime, history.eetime, 
history.reason FROM vm_pool LEFT OUTER JOIN history ON 
vm_pool.oid = history.vid AND history.seq = vm_pool.last_seq LEFT 
OUTER JOIN (SELECT oid,user_name FROM user_pool) AS user_pool ON 
vm_pool.uid = user_pool.oid WHERE vm_pool.state <> 6, error 2006 
: MySQL server has gone away
Tue Sep 27 13:47:20 2011 [ReM][E]: [VirtualMachinePoolInfo] Error 
getting VM Pool.

And some other general messages, probably from monitoring:

Tue Sep 27 13:47:13 2011 [ONE][E]: SQL command was: SELECT oid, 
im_mad FROM host_pool WHERE state != 4 ORDER BY last_mon_time ASC 
LIMIT 15, error 2006 : MySQL server has gone away
Tue Sep 27 13:47:13 2011 [ONE][E]: SQL command was: SELECT oid 
FROM vm_pool WHERE last_poll <= 1317130633 and state = 3 and ( 
lcm_state = 3 or lcm_state = 16 ) ORDER BY last_poll ASC LIMIT 5, 
error 2006 : MySQL server has gone away

For some reason oned does not reconnect to the MySQL server. I 
do not know how this is implemented (or whether it depends on 
my system), but I think that if the mysql library is used, the 
reconnect should happen automatically and transparently. A mysql 
client that was still running across the restart of mysqld 
handles this just fine and transparently (with just an 
informational message):

mysql> show databases;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    1
Current database: *** NONE ***

| Database           |
| information_schema |
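
To illustrate the behaviour I would expect, here is a minimal 
Python sketch of "reconnect and retry once" on a lost connection. 
This is not how oned is implemented (I have not looked at that 
code); ServerGoneAway, ReconnectingClient, FakeConnection and the 
connect() factory are all hypothetical names made up for the 
example, and the fake connection only stands in for a real 
MySQL connection that returns error 2006.

```python
class ServerGoneAway(Exception):
    """Hypothetical stand-in for MySQL client error 2006."""


class ReconnectingClient:
    """Wraps a connection and reconnects when the server has gone away."""

    def __init__(self, connect, max_retries=1):
        # 'connect' is an assumed factory that returns a fresh connection.
        self._connect = connect
        self._max_retries = max_retries
        self._conn = connect()

    def query(self, sql):
        attempts = 0
        while True:
            try:
                return self._conn.execute(sql)
            except ServerGoneAway:
                if attempts >= self._max_retries:
                    raise  # give up and report the error to the caller
                attempts += 1
                # Drop the dead connection and open a new one, then retry.
                self._conn = self._connect()


class FakeConnection:
    """Test double: starts failing once 'broken' is set, like a
    connection whose server was restarted underneath it."""

    def __init__(self):
        self.broken = False

    def execute(self, sql):
        if self.broken:
            raise ServerGoneAway(2006)
        return "ok: " + sql
```

Blindly retrying like this is only safe for reads such as the 
SELECT statements in the log above. A write that was interrupted 
half way would need to be checked before it is repeated.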

After also restarting OpenNebula (oned, scheduler), everything 
seems to work fine again. But I guess that if mysqld is down (or 
is going down) at the wrong moment, the database might not have 
saved all the needed information, e.g. at the moment when the 
scheduler is deploying a VM to a cluster node. Could something 
like this cause the reporting errors Steve is seeing?

