[one-users] Problems when booting VM (OpenNebula 2.0.1 and ESXi 4.1)

Luigi Fortunati luigi.fortunati at gmail.com
Thu Feb 3 02:08:16 PST 2011


Hi Matthias,
First of all, thanks for the reply.
I checked the logs that appear in the "var" folder when the
"ONE_MAD_DEBUG" option is set to 1 in the $ONE_LOCATION/etc/defaultrc file
(which enables logging for all the VMware drivers), but their content seems
a bit cryptic and unhelpful.

Anyhow, I managed to solve the problem with launching a new VM. I
realized that the cluster hosts act as the root user when managing files on
the NFS repository. Root squashing was enabled on my NFS server, so the root
user of the cluster hosts was mapped to UID 65534 (user "nobody") on the
OpenNebula frontend host.
Therefore the root user of the cluster hosts couldn't access files placed on
the shared image repository because it lacked the necessary permissions.
I decided to solve this problem by modifying the /etc/exports file on the
OpenNebula frontend:

/srv/cloud/one/var  <ip address/mask>(rw,all_squash,anonuid=<id of 'oneadmin' user>,anongid=<id of 'cloud' group>)

With these settings, all users of the cluster hosts are mapped to
oneadmin:cloud on the frontend.
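
As a rough illustration of the export entry described above, the following sketch assembles such a line from its parts. The function name, the client network 192.0.2.0/24 and the numeric IDs 1001 are hypothetical placeholders, not values from my setup; the real IDs would be those of the 'oneadmin' user and the 'cloud' group on the frontend.

```python
def exports_line(path, client, anonuid, anongid):
    """Build one /etc/exports entry that maps every remote user to one local uid/gid."""
    # all_squash maps all remote users (not only root) to the anonymous account,
    # and anonuid/anongid choose which local uid/gid that anonymous account is.
    options = ["rw", "all_squash", f"anonuid={anonuid}", f"anongid={anongid}"]
    joined = ",".join(options)  # the option list is written without spaces
    return f"{path} {client}({joined})"


# Hypothetical values: replace the network and the IDs with your own.
line = exports_line("/srv/cloud/one/var", "192.0.2.0/24", 1001, 1001)
print(line)
```

After editing /etc/exports the export table normally has to be reloaded (for example with "exportfs -ra") before the change takes effect.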

So far I can start a VM that uses nonpersistent images, but I can't start a
VM with persistent images.
This is the output of vm.log:
Wed Feb  2 15:51:54 2011 [DiM][I]: New VM state is ACTIVE.
Wed Feb  2 15:51:54 2011 [LCM][I]: New VM state is PROLOG.
Wed Feb  2 15:51:54 2011 [VM][I]: Virtual Machine has no context
Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Creating directory /srv/cloud/one/var/49/images
Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "mkdir -p /srv/cloud/one/var/49/images".
Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "chmod a+w /srv/cloud/one/var/49/images".
Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Link /srv/cloud/one/var/images/7594aeafecbabec0de9da508cf2500fb486675a6
Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "ln -s ../../images/7594aeafecbabec0de9da508cf2500fb486675a6 /srv/cloud/one/var/49/images/disk.0".
Wed Feb  2 15:51:54 2011 [LCM][I]: New VM state is BOOT
Wed Feb  2 15:51:54 2011 [VMM][I]: Generating deployment file: /srv/cloud/one/var/49/deployment.0
Wed Feb  2 15:51:54 2011 [VMM][I]: Command execution fail: /srv/cloud/one/lib/remotes/vmm/vmware/deploy custom6.sns.it /srv/cloud/one/var/49/deployment.0
Wed Feb  2 15:51:54 2011 [VMM][I]: STDERR follows.
Wed Feb  2 15:51:54 2011 [VMM][I]: [VMWARE] cmd failed [/srv/cloud/one/bin/tty_expect -u oneadmin -p password1234 virsh -c esx://custom6.sns.it?no_verify=1 define /srv/cloud/one/var/49/deployment.0]. Stderr:
Wed Feb  2 15:51:54 2011 [VMM][I]: error: Failed to define domain from /srv/cloud/one/var/49/deployment.0
Wed Feb  2 15:51:54 2011 [VMM][I]: error: internal error HTTP response code 500 for upload to 'https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images'
Wed Feb  2 15:51:54 2011 [VMM][I]:
Wed Feb  2 15:51:54 2011 [VMM][I]: . Stdout: ExitCode: 1
Wed Feb  2 15:51:54 2011 [VMM][I]: ExitCode: 1
Wed Feb  2 15:51:54 2011 [VMM][E]: Error deploying virtual machine
Wed Feb  2 15:51:54 2011 [DiM][I]: New VM state is FAILED
Wed Feb  2 15:51:54 2011 [TM][W]: Ignored: LOG - 49 tm_delete.sh: Deleting /srv/cloud/one/var/49/images

Wed Feb  2 15:51:54 2011 [TM][W]: Ignored: LOG - 49 tm_delete.sh: Executed "rm -rf /srv/cloud/one/var/49/images".

Wed Feb  2 15:51:54 2011 [TM][W]: Ignored: TRANSFER SUCCESS 49 -

Wed Feb  2 16:38:54 2011 [DiM][I]: New VM state is DONE.
Wed Feb  2 16:38:54 2011 [HKM][I]: Command execution fail: /srv/cloud/one/share/hooks/image.rb 49
Wed Feb  2 16:38:54 2011 [HKM][I]: STDERR follows.
Wed Feb  2 16:38:54 2011 [HKM][I]: ExitCode: 255
Wed Feb  2 16:38:54 2011 [HKM][E]: Error executing Hook: image.

Given that OpenNebula creates a soft link in the directory
$ONE_LOCATION/var/49/images/ pointing to the directory of the persistent
image and then writes the path of the vmdk disk into the deployment.0 file
(the XML file in $ONE_LOCATION/var/49/) as
<source file='[images] 49/images/disk.0/disk.vmdk'/>
the use of a soft link seems to be the problem.
I tried changing the path to the vmdk file in the deployment.0 file like
this:
<source file='[images] images/7594aeafecbabec0de9da508cf2500fb486675a6/disk.vmdk'/>
so that it now points to the disk directly, without passing through the
soft link.

I then ran the virsh command that defines the new domain on the cluster
host:
/srv/cloud/one/bin/tty_expect -u oneadmin -p custom2011 virsh -c esx://custom6.sns.it?no_verify=1 define /srv/cloud/one/var/49/deployment.0
This last attempt succeeded.
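
The manual path rewrite described above can be sketched as follows. This is only an illustration of the workaround, not something the OpenNebula drivers do: the function name is hypothetical, and it assumes that the "images" datastore is rooted at the directory containing both "49/" and "images/" (/srv/cloud/one/var in my case) and that the code runs on a POSIX system.

```python
import os
import tempfile


def source_attribute(datastore_name, datastore_root, link_path, disk_file="disk.vmdk"):
    """Return a '[datastore] path' string that bypasses the symlink at link_path."""
    # realpath() follows the symlink to the directory it finally points at.
    real_dir = os.path.realpath(link_path)
    root = os.path.realpath(datastore_root)
    # Express the resolved directory relative to the datastore root.
    rel = os.path.relpath(real_dir, root).replace(os.sep, "/")
    return f"[{datastore_name}] {rel}/{disk_file}"


if __name__ == "__main__":
    # Throwaway directory tree mimicking the layout from the log
    # (the short image name is illustrative only).
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "images", "7594aeaf"))
    os.makedirs(os.path.join(root, "49", "images"))
    link = os.path.join(root, "49", "images", "disk.0")
    os.symlink("../../images/7594aeaf", link)
    print(source_attribute("images", root, link))
```

The returned string is what would go into the file attribute of the <source> element in place of the path that runs through disk.0.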

Why can't it work with symlinks?

Moreover, given this log entry from vm.log:
Wed Feb  2 15:51:54 2011 [VMM][I]: error: internal error HTTP response code 500 for upload to 'https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images'

What does it mean?
If we decode the percent-encoded characters, the URL looks like this:
https://custom6.sns.it:443/folder/49/images/disk.0/one-49.vmx?dcPath=ha-datacenter&dsName=images
Does it mean that it can't upload the vmx file to the given folder?
This problem is probably more related to libvirt. I found good information
about libvirt and ESX here:
http://libvirt.org/drvesx.html (this may help you as it has helped me)
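
For anyone who wants to reproduce the decoding of the upload URL from the log, a minimal sketch using only the Python standard library is shown here. The variable names are mine; the URL is the one from vm.log.

```python
from urllib.parse import unquote

# Upload URL exactly as it appears in vm.log (split only for readability).
encoded = (
    "https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/"
    "one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images"
)

# unquote() turns %2f into "/", %2e into "." and %2d into "-".
decoded = unquote(encoded)
print(decoded)
```

The decoded form makes it easier to see that the vmx file is being uploaded into the folder 49/images/disk.0 of the "images" datastore, that is, through the soft link.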

Can someone post the deployment.0 file of a VM that launched correctly, for
both a persistent and a nonpersistent image?

P.S.: Another thing I noticed is that the VM doesn't get scheduled on a
cluster host if you don't set the number of VCPUs in the VM template file!

On Tue, Feb 1, 2011 at 2:37 PM, Matthias Keller <mkeller at upb.de> wrote:

> Dear Luigi,
>
> referring to your last post today (01.02), I might be able to help. We are
> in the same situation, as we are also trying to build an OpenNebula cloud
> with ESXi servers. After some headaches caused by my limited administration
> skills and by the somewhat unclear documentation of the configuration
> process on the website, we figured out how to launch VMs on ESX servers. As
> I'm not able to diagnose the issues you described, I'll try to summarize
> our experiences:
>
> - we had to reinstall our OpenNebula 2 installation, because the system-wide
> installation (/etc/oned.conf, etc.) didn't fully work with some scripts,
> especially the scripts of the ESX drivers plugin. So our second attempt
> ended up, like your installation, in /srv/cloud/one
> - after logging in again, the ONE_LOCATION environment variable was not set
> correctly, so we had to make sure it is set in .bashrc and switch from root
> using "su - oneadmin", which initializes the oneadmin shell environment
> - first we tried starting a VM without a network to minimize the sources of
> error.
> - we installed the script fix mentioned in the documentation, with sudo
> rights. It is needed to change the owner so that the oneadmin user of ESX is
> able to access it in order to start your VM. I guess your problem can be
> solved somewhere near "starting a VM".
> - the [image] ESX volume should be the correct one (DATASTORE), which can be
> checked via the vSphere client. The oneadmin user has to have the same UID
> in ESX and Linux. We also checked this by manually starting a placed VM via
> the vSphere client.
> - specifying Arch=i686 is necessary (as you did).
> - activate logging for the VMM driver (because I assume your TM driver works
> properly):
> if your oned.conf looks like:
> #  VMware Driver Addon Virtualization Driver Manager Configuration
>
> #-------------------------------------------------------------------------------
> VM_MAD = [
>    name       = "vmm_vmware",
>    executable = "one_vmm_sh",
>    arguments  = "vmware",
>    default    = "vmm_sh/vmm_sh_vmware.conf",
>    type       = "vmware" ]
>
> then the file should contain the following:
> file:/srv/cloud/one/etc/vmm_sh/vmm_sh_vmwarerc
> # Uncomment the following line to active MAD debug
> ONE_MAD_DEBUG="1"
>
> For every driver it should work with editing the following file - but I'm
> not really sure about that:
> /srv/cloud/one/etc/defaultrc
>
> Logs are placed in /srv/cloud/one/var
>
> I hope this helps,
>
> Matthias Keller
>
>
>
>
>


-- 
Luigi Fortunati