[one-users] fault tolerance

Gareth de Vaux opennebula at lordcow.org
Tue Sep 18 08:27:33 PDT 2012


On Fri 2012-09-14 (12:14), Tino Vazquez wrote:
> It looks like there was a problem with the VM after it was put in
> PENDING. Could you please send us the log of the individual VM
> (/var/lib/one/<vid>/vm.log) to take a look?

Attached at the end of this mail, though it mostly looks like oned.log.

> It should be possible to resubmit a VM in failed state, what is the
> error message you get?

Sorry, I think I tried to 'restart' it. I've successfully resubmitted it now.

> Regarding the problem with "onehost sync", does oneadmin have write
> permissions over /var/lib/one/remotes?

Nope, it was owned by root, which I figured was normal for /var, forgetting
that this is where one's $HOME is. Fixed now, thanks. For the record, Debian
seems to set the ownership of $HOME correctly, except for ~/remotes and
~/datastores.
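
For anyone hitting the same "onehost sync" problem, here is a minimal sketch
of an ownership check. check_owner is a hypothetical helper, not part of
OpenNebula. It assumes GNU stat (as on Debian) and, in the usage comment,
that the account is oneadmin with its home in /var/lib/one, as in this thread.

```shell
# check_owner EXPECTED_USER PATH...
# Prints one line per path that is missing or not owned by EXPECTED_USER.
# Hypothetical helper for illustration; relies on GNU stat's -c %U format.
check_owner() {
    expected="$1"
    shift
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing: $path"
            continue
        fi
        owner=$(stat -c %U "$path")
        if [ "$owner" != "$expected" ]; then
            echo "wrong owner: $path ($owner)"
        fi
    done
}

# Usage (paths taken from this thread; the user name is an assumption):
#   check_owner oneadmin /var/lib/one/remotes /var/lib/one/datastores
# If anything is reported, as root:
#   chown -R oneadmin /var/lib/one/remotes /var/lib/one/datastores
```

No output means every listed path exists and belongs to the expected user.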

So I'm able to resubmit manually, but the hook still doesn't work.


10.log:

Wed Sep 12 17:54:13 2012 [VMM][D]: Monitor Information:
	CPU   : 6
	Memory: 524288
	Net_TX: 4761
	Net_RX: 49874
Wed Sep 12 17:56:27 2012 [VMM][I]: Command execution fail: 'if [ -x "/var/tmp/one/vmm/kvm/poll" ]; then /var/tmp/one/vmm/kvm/poll one-10 arcus 10 arcus; else                              exit 42; fi'
Wed Sep 12 17:56:27 2012 [VMM][I]: ssh: connect to host arcus port 22: Connection timed out
Wed Sep 12 17:56:27 2012 [VMM][I]: ExitCode: 255
Wed Sep 12 17:56:27 2012 [VMM][E]: Error monitoring VM
Wed Sep 12 17:56:27 2012 [VMM][E]: Error monitoring VM
Wed Sep 12 17:57:29 2012 [DiM][I]: New VM state is PENDING
Wed Sep 12 17:57:29 2012 [TM][W]: Ignored: LOG I 10 ExitCode: 0

Wed Sep 12 17:57:29 2012 [VMM][W]: Ignored: LOG I 10 Driver command for 10 cancelled

Wed Sep 12 17:57:37 2012 [DiM][I]: New VM state is ACTIVE.
Wed Sep 12 17:57:38 2012 [LCM][I]: New VM state is PROLOG.
Wed Sep 12 17:57:38 2012 [VM][I]: Virtual Machine has no context
Wed Sep 12 17:57:38 2012 [TM][I]: Command execution fail: /var/lib/one/remotes/tm/shared/ln cirrus:/var/lib/one/datastores/1/aab3c5409d45f015626af354c827a776 nimbus:/var/lib/one//datastores/0/10/disk.0
Wed Sep 12 17:57:38 2012 [TM][I]: ln: Linking ../../1/aab3c5409d45f015626af354c827a776 in nimbus:/var/lib/one//datastores/0/10/disk.0
Wed Sep 12 17:57:38 2012 [TM][E]: ln: Command "cd /var/lib/one/datastores/0/10; ln -s ../../1/aab3c5409d45f015626af354c827a776 /var/lib/one/datastores/0/10/disk.0" failed: ln: failed to create symbolic link `/var/lib/one/datastores/0/10/disk.0': File exists
Wed Sep 12 17:57:38 2012 [TM][E]: Error linking cirrus:/var/lib/one/datastores/1/aab3c5409d45f015626af354c827a776 to nimbus:/var/lib/one//datastores/0/10/disk.0
Wed Sep 12 17:57:38 2012 [TM][I]: ExitCode: 1
Wed Sep 12 17:57:38 2012 [TM][E]: Error executing image transfer script: Error linking cirrus:/var/lib/one/datastores/1/aab3c5409d45f015626af354c827a776 to nimbus:/var/lib/one//datastores/0/10/disk.0
Wed Sep 12 17:57:38 2012 [DiM][I]: New VM state is FAILED
Wed Sep 12 17:58:32 2012 [TM][W]: Ignored: LOG I 10 Command execution fail: /var/lib/one/remotes/tm/shared/delete arcus:/var/lib/one//datastores/0/10

Wed Sep 12 17:58:32 2012 [TM][W]: Ignored: LOG I 10 delete: Deleting /var/lib/one/datastores/0/10

Wed Sep 12 17:58:32 2012 [TM][W]: Ignored: LOG E 10 delete: Command "rm -rf /var/lib/one/datastores/0/10" failed: ssh: connect to host arcus port 22: Connection timed out

Wed Sep 12 17:58:32 2012 [TM][W]: Ignored: LOG E 10 Error deleting /var/lib/one/datastores/0/10

Wed Sep 12 17:58:32 2012 [TM][W]: Ignored: LOG I 10 ExitCode: 255

Wed Sep 12 17:58:32 2012 [TM][W]: Ignored: TRANSFER FAILURE 10 Error deleting /var/lib/one/datastores/0/10

Wed Sep 12 17:58:32 2012 [VMM][W]: Ignored: LOG I 10 Command execution fail: /var/tmp/one/vmm/kvm/cancel one-10 arcus 10 arcus

Wed Sep 12 17:58:32 2012 [VMM][W]: Ignored: LOG I 10 ssh: connect to host arcus port 22: Connection timed out

Wed Sep 12 17:58:32 2012 [VMM][W]: Ignored: LOG I 10 ExitSSHCode: 255

Wed Sep 12 17:58:32 2012 [VMM][W]: Ignored: LOG E 10 Error connecting to arcus

Wed Sep 12 17:58:32 2012 [VMM][W]: Ignored: LOG I 10 Failed to execute virtualization driver operation: cancel.

Wed Sep 12 17:58:32 2012 [VMM][W]: Ignored: CANCEL FAILURE 10 Error connecting to arcus
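
The PROLOG failure in the log above is a plain "ln -s" refusing to overwrite
an existing disk.0. A minimal sketch that reproduces the error in a scratch
directory and shows that "ln -sfn" would replace the link instead.
demo_relink and the directory layout are made up for illustration, and
whether the tm driver ought to force the link is a separate question.

```shell
# demo_relink: reproduce "File exists" from ln -s, then replace the link.
# Hypothetical demo; it works only inside a throwaway temp directory.
demo_relink() (
    # Subshell body, so the cd below does not affect the caller.
    dir=$(mktemp -d) || exit 1
    mkdir -p "$dir/1" "$dir/0/10"
    touch "$dir/1/image"
    cd "$dir/0/10" || exit 1

    # First deployment: the link is created.
    ln -s ../../1/image disk.0

    # Second deployment into the same directory: ln -s fails.
    if ln -s ../../1/image disk.0 2>/dev/null; then
        echo "second ln succeeded"
    else
        echo "second ln failed"
    fi

    # -f removes the existing destination, -n avoids following it.
    ln -sfn ../../1/image disk.0 && echo "relinked"
    readlink disk.0

    cd / && rm -rf "$dir"
)
```

Calling demo_relink prints the outcome of the second plain link attempt,
then the result of the forced relink and the link target.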


