[one-users] ONE 2.0 underreporting running VM's in onehost list
Fabian Wenk
fabian at wenks.ch
Tue Sep 27 07:14:01 PDT 2011
Hello
On 26.09.2011 13:44, Carlos Martín Sánchez wrote:
> The host_shares contains the "running_vms" column; you need to update that
> column value with OpenNebula stopped.
>
> We are still trying to figure out what causes this bug, so if you come
> across it again, it would be great if you could write down the operations
> that led to it.
I do not know if this is related or not, but I guess it could be
an indication.
I am running OpenNebula 2.2.1 with a MySQL database. I just
restarted mysqld, and now all the one* commands report errors like this:
# onevm list
[VirtualMachinePoolInfo] Error getting VM Pool.
In oned.log I see the following messages (regarding the 'onevm
list' command):
Tue Sep 27 13:47:20 2011 [ReM][D]: VirtualMachinePoolInfo method invoked
Tue Sep 27 13:47:20 2011 [ONE][E]: SQL command was: SELECT
vm_pool.oid, vm_pool.uid, vm_pool.name, vm_pool.last_poll,
vm_pool.state, vm_pool.lcm_state, vm_pool.stime, vm_pool.etime,
vm_pool.deploy_id, vm_pool.memory, vm_pool.cpu, vm_pool.net_tx,
vm_pool.net_rx, vm_pool.last_seq, vm_pool.template,
user_pool.user_name, history.vid, history.seq, history.host_name,
history.vm_dir, history.hid, history.vm_mad, history.tm_mad,
history.stime, history.etime, history.pstime, history.petime,
history.rstime, history.retime, history.estime, history.eetime,
history.reason FROM vm_pool LEFT OUTER JOIN history ON
vm_pool.oid = history.vid AND history.seq = vm_pool.last_seq LEFT
OUTER JOIN (SELECT oid,user_name FROM user_pool) AS user_pool ON
vm_pool.uid = user_pool.oid WHERE vm_pool.state <> 6,
error 2006 : MySQL server has gone away
Tue Sep 27 13:47:20 2011 [ReM][E]: [VirtualMachinePoolInfo] Error getting VM Pool.
And some other general messages, probably from monitoring:
Tue Sep 27 13:47:13 2011 [ONE][E]: SQL command was: SELECT oid,
im_mad FROM host_pool WHERE state != 4 ORDER BY last_mon_time ASC
LIMIT 15, error 2006 : MySQL server has gone away
Tue Sep 27 13:47:13 2011 [ONE][E]: SQL command was: SELECT oid
FROM vm_pool WHERE last_poll <= 1317130633 and state = 3 and (
lcm_state = 3 or lcm_state = 16 ) ORDER BY last_poll ASC LIMIT 5,
error 2006 : MySQL server has gone away
For some reason oned does not reconnect to the MySQL server. I
do not know how this is implemented (or whether it depends on
my system), but I think that if the MySQL client library is used,
the reconnect should happen automatically and transparently. A
mysql client that was still running across the restart of mysqld
handles this just fine and transparently (with just an
informational message):
mysql> show databases;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 1
Current database: *** NONE ***
+--------------------+
| Database |
+--------------------+
| information_schema |
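
To illustrate the kind of behaviour I would expect, here is a minimal
sketch in Python. It is not how oned is implemented (I have not looked
at that code); ServerGoneAway, FlakyConnection and
execute_with_reconnect are hypothetical names made up for this example,
and the connection is a stub rather than a real MySQL client. It only
shows the idea of reconnecting and retrying once a "server has gone
away" error is seen.

```python
import time


class ServerGoneAway(Exception):
    """Stand-in for MySQL client error 2006 (server has gone away)."""


class FlakyConnection:
    """Hypothetical stub: every query fails until reconnect() is called."""

    def __init__(self):
        self.alive = False      # simulates a connection lost by a mysqld restart
        self.reconnects = 0

    def reconnect(self):
        self.alive = True
        self.reconnects += 1

    def execute(self, sql):
        if not self.alive:
            raise ServerGoneAway(2006, "MySQL server has gone away")
        return "ok: " + sql


def execute_with_reconnect(conn, sql, retries=3, delay=0.0):
    """Run sql; on a lost connection, reconnect and try again.

    Gives up and re-raises after `retries` reconnect attempts.
    """
    for attempt in range(retries + 1):
        try:
            return conn.execute(sql)
        except ServerGoneAway:
            if attempt == retries:
                raise
            time.sleep(delay)   # optional pause before reconnecting
            conn.reconnect()


conn = FlakyConnection()
print(execute_with_reconnect(conn, "SELECT oid FROM vm_pool"))
```

Blindly retrying is probably only safe for statements that can be
repeated without harm (like the SELECTs in the log above); a write that
was interrupted halfway would need more care.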
After also restarting OpenNebula (oned, scheduler), everything
seems to work fine again. But I guess that if mysqld is down (or
is going down) at the wrong moment, e.g. at the moment when the
scheduler is deploying a VM to a cluster node, the database
might not have saved all the needed information. Could
something like this cause the reporting errors Steve is seeing?
bye
Fabian