[one-users] Problems when booting VM (OpenNebula 2.0.1 and ESXi 4.1)

Luigi Fortunati luigi.fortunati at gmail.com
Thu Feb 17 02:16:28 PST 2011


Hi,
I fixed the problem related to the use of symlinks.
I found out that when a symlink to a directory is created (which is what
OpenNebula does), my ESXi 4.1 server sees the directory not as "disk.0"
but under the same name as the target directory in the image repository
(something like "44bdf63hf63kd73nfetcetc..."). You can verify this by
browsing the "images" datastore with vSphere Client. I decided to change
the script vmware/tm_ln.sh so that it creates a directory disk.0 in the
$ONE_LOCATION/var/<vmID>/images/ directory and then creates hardlinks to the
vmdk files in the image repository, and now persistent images are working. :-)
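For anyone hitting the same issue, here is a minimal, self-contained sketch of the idea (the variable names and stand-in paths below are illustrative, not the real tm_ln.sh variables):

```shell
# Sketch of the tm_ln.sh change: instead of symlinking the repository
# directory itself, create a real disk.0 directory and hardlink each
# vmdk file into it, so ESXi sees a directory literally named "disk.0".
# REPO and VMDIR are temporary stand-ins for the real OpenNebula paths.
set -e
REPO=$(mktemp -d)    # stands in for $ONE_LOCATION/var/images/<image hash>
VMDIR=$(mktemp -d)   # stands in for $ONE_LOCATION/var/<vmID>/images
touch "$REPO/disk.vmdk" "$REPO/disk-flat.vmdk"

mkdir -p "$VMDIR/disk.0"
for f in "$REPO"/*.vmdk; do
    ln "$f" "$VMDIR/disk.0/$(basename "$f")"   # hardlink, not symlink
done
```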

On Mon, Feb 7, 2011 at 5:15 PM, Tino Vazquez <tinova at fdi.ucm.es> wrote:

> Hi Luigi,
>
> There is indeed a bug for system-wide installations, we are working on
> it and will be fixed in the upcoming 2.2 release.
>
> About the symlink, we have tests for the Image catalog in VMware that
> show it working for both persistent and non-persistent images. The
> template is trivial: just use an image flagged as persistent within a
> VM. Maybe the problem lies somewhere else; for instance, are you
> using VMFS? In our tests we use NFS.
>
> About the VCPU, the scheduler completely ignores this parameter, so it
> shouldn't make a difference for scheduling.
>
> Regards,
>
> -Tino
>
> --
> Constantino Vázquez Blanco | dsa-research.org/tinova
> Virtualization Technology Engineer / Researcher
> OpenNebula Toolkit | opennebula.org
>
>
>
> On Thu, Feb 3, 2011 at 11:08 AM, Luigi Fortunati
> <luigi.fortunati at gmail.com> wrote:
> > Hi Matthias,
> > First of all, thanks for the reply.
> > I checked the logs that appear in the "var" folder when the
> > "ONE_MAD_DEBUG" option is set to 1 in the $ONE_LOCATION/etc/defaultrc file
> > (enabling logging for all the VMware drivers), but the content seems a bit
> > cryptic and unhelpful.
> > Anyhow, I managed to solve the problem regarding the launch of a new VM. I
> > realized that the cluster hosts work as root when managing files on
> > the NFS repository. Given the settings of my NFS server, I noticed that
> > root squashing was enabled, so the root user of the cluster hosts was
> > mapped to the user with UID 65534 (user "nobody") on the OpenNebula
> > frontend host. Therefore the root user of the cluster hosts couldn't
> > access files placed on the shared image repository because it didn't
> > have sufficient permissions.
> > I decided to solve this problem by modifying the /etc/exports file on the
> > OpenNebula frontend:
> > /srv/cloud/one/var  <ip address/mask>(rw,all_squash,anonuid=<id of
> > 'oneadmin' user>,anongid=<id of 'cloud' group>)
> > With these settings all users of the cluster hosts are mapped to
> > oneadmin:cloud on the frontend.
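> > As a sketch, the export line can be built like this (the network, UID
> > and GID values below are placeholders, not the real ones):

```shell
# Hypothetical sketch: build the /etc/exports line with the frontend
# user's UID/GID. The values here are placeholders; use the real output
# of `id -u oneadmin` and `id -g oneadmin` on the frontend.
ANON_UID=1001          # placeholder for the 'oneadmin' UID
ANON_GID=1001          # placeholder for the 'cloud' GID
NET="192.168.0.0/24"   # placeholder for the cluster network

LINE="/srv/cloud/one/var ${NET}(rw,all_squash,anonuid=${ANON_UID},anongid=${ANON_GID})"
echo "$LINE"
# then, as root:  echo "$LINE" >> /etc/exports && exportfs -ra
```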
> > Up to now I can start a VM that uses non-persistent images, but I can't
> > start a VM with persistent images.
> > This is the output of vm.log:
> > Wed Feb  2 15:51:54 2011 [DiM][I]: New VM state is ACTIVE.
> > Wed Feb  2 15:51:54 2011 [LCM][I]: New VM state is PROLOG.
> > Wed Feb  2 15:51:54 2011 [VM][I]: Virtual Machine has no context
> > Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Creating directory
> > /srv/cloud/one/var/49/images
> > Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "mkdir -p
> > /srv/cloud/one/var/49/images".
> > Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "chmod a+w
> > /srv/cloud/one/var/49/images".
> > Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Link
> > /srv/cloud/one/var/images/7594aeafecbabec0de9da508cf2500fb486675a6
> > Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "ln -s
> > ../../images/7594aeafecbabec0de9da508cf2500fb486675a6 /srv/cloud/
> > one/var/49/images/disk.0".
> > Wed Feb  2 15:51:54 2011 [LCM][I]: New VM state is BOOT
> > Wed Feb  2 15:51:54 2011 [VMM][I]: Generating deployment file:
> > /srv/cloud/one/var/49/deployment.0
> > Wed Feb  2 15:51:54 2011 [VMM][I]: Command execution fail:
> > /srv/cloud/one/lib/remotes/vmm/vmware/deploy custom6.sns.it /srv/c
> > loud/one/var/49/deployment.0
> > Wed Feb  2 15:51:54 2011 [VMM][I]: STDERR follows.
> > Wed Feb  2 15:51:54 2011 [VMM][I]: [VMWARE] cmd failed
> > [/srv/cloud/one/bin/tty_expect -u oneadmin -p password1234 virsh -c
> > esx://custom6.sns.it?no_verify=1 define /srv/cloud/one/var/49/deployment.0].
> > Stderr:
> > Wed Feb  2 15:51:54 2011 [VMM][I]: error: Failed to define domain from
> > /srv/cloud/one/var/49/deployment.0
> > Wed Feb  2 15:51:54 2011 [VMM][I]: error: internal error HTTP response code
> > 500 for upload to
> > 'https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images'
> > Wed Feb  2 15:51:54 2011 [VMM][I]:
> > Wed Feb  2 15:51:54 2011 [VMM][I]: . Stdout: ExitCode: 1
> > Wed Feb  2 15:51:54 2011 [VMM][I]: ExitCode: 1
> > Wed Feb  2 15:51:54 2011 [VMM][E]: Error deploying virtual machine
> > Wed Feb  2 15:51:54 2011 [DiM][I]: New VM state is FAILED
> > Wed Feb  2 15:51:54 2011 [TM][W]: Ignored: LOG - 49 tm_delete.sh: Deleting
> > /srv/cloud/one/var/49/images
> > Wed Feb  2 15:51:54 2011 [TM][W]: Ignored: LOG - 49 tm_delete.sh: Executed
> > "rm -rf /srv/cloud/one/var/49/images".
> > Wed Feb  2 15:51:54 2011 [TM][W]: Ignored: TRANSFER SUCCESS 49 -
> > Wed Feb  2 16:38:54 2011 [DiM][I]: New VM state is DONE.
> > Wed Feb  2 16:38:54 2011 [HKM][I]: Command execution fail:
> > /srv/cloud/one/share/hooks/image.rb 49
> > Wed Feb  2 16:38:54 2011 [HKM][I]: STDERR follows.
> > Wed Feb  2 16:38:54 2011 [HKM][I]: ExitCode: 255
> > Wed Feb  2 16:38:54 2011 [HKM][E]: Error executing Hook: image.
> > Given that OpenNebula creates a soft link in the directory
> > $ONE_LOCATION/var/49/images/ pointing to the directory of the persistent
> > image, and then writes the path of the vmdk disk (<source file='[images]
> > 49/images/disk.0/disk.vmdk'/>) into the deployment.0 file (the XML file
> > in $ONE_LOCATION/var/49/), the use of a soft link seems to be the
> > problem.
> > I tried to change the path to the vmdk file in the deployment.0 file like
> > this:
> > <source file='[images]
> > images/7594aeafecbabec0de9da508cf2500fb486675a6/disk.vmdk'/>
> > so that it now points to the disk directly, without passing through the
> > soft link, and then launched the virsh command that defines the new
> > domain on the cluster host:
> > /srv/cloud/one/bin/tty_expect -u oneadmin -p custom2011 virsh -c
> > esx://custom6.sns.it?no_verify=1 define /srv/cloud/one/var/49/deployment.0
> > This last attempt completed correctly.
> > Why can't it work with symlinks?
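> > For reference, the manual edit can be reproduced like this (a sketch on
> > a stand-in file; the hash and paths are just the ones from this run):

```shell
# Sketch of the manual workaround: rewrite the disk path in deployment.0
# so that it bypasses the symlink. DEPLOY is a stand-in temporary file;
# in the real setup it would be /srv/cloud/one/var/49/deployment.0.
HASH=7594aeafecbabec0de9da508cf2500fb486675a6
DEPLOY=$(mktemp)
echo "<source file='[images] 49/images/disk.0/disk.vmdk'/>" > "$DEPLOY"

# point the source directly at the image repository directory
sed -i "s|49/images/disk.0/|images/${HASH}/|" "$DEPLOY"
cat "$DEPLOY"
```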
> > Moreover, given this log entry from vm.log:
> > Wed Feb  2 15:51:54 2011 [VMM][I]: error: internal error HTTP response code
> > 500 for upload to
> > 'https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images'
> > What does it mean?
> > If we decode the percent-encoded characters, the path looks like this:
> > https://custom6.sns.it:443/folder/49/images/disk.0/one-49.vmx?dcPath=ha-datacenter&dsName=images
> > Does it mean that it can't upload the vmx file to the given folder?
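> > (One quick way to decode such percent-escapes in bash, as a sketch:)

```shell
# Sketch: decode %XX escapes by rewriting each % as \x and letting
# bash's printf %b interpret the resulting \xXX sequences.
# (Bash-specific; plain POSIX sh printf does not support \xHH.)
url='https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx'
decoded=$(printf '%b' "${url//\%/\\x}")
echo "$decoded"
```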
> > This problem is probably more related to libvirt. I found good information
> > about libvirt and ESX here:
> > http://libvirt.org/drvesx.html (this may help you as it has helped me)
> > Can someone post a deployment.0 file of a VM correctly launched with
> > either a persistent or a non-persistent image?
> > P.S.: Another thing I noticed is that the VM doesn't get scheduled on a
> > cluster host if you don't set the number of VCPUs in the VM template file!
> > On Tue, Feb 1, 2011 at 2:37 PM, Matthias Keller <mkeller at upb.de> wrote:
> >>
> >> Dear Luigi,
> >>
> >> referring to your last post today (01.02), I might be able to help. We
> >> are in the same situation, as we are also trying to build up an
> >> OpenNebula cloud with ESXi servers. After some headaches caused by my
> >> limited administration skills and by the somewhat unclear documentation
> >> of the website / configuration process, we figured out how to launch VMs
> >> on ESX servers. As I'm not able to diagnose your described issues, I'll
> >> try to summarize our experiences:
> >>
> >> - we had to reinstall our OpenNebula 2 installation, because a
> >> system-wide installation (/etc/oned.conf, etc.) didn't work fully: some
> >> scripts, and especially the scripts of the ESX driver plugin, failed. So
> >> our second try ended up, like your installation, in /srv/cloud/one
> >> - after logging in again, the ONE_LOCATION environment variable was not
> >> set correctly, so we had to make sure it is exported somewhere in
> >> .bashrc and to switch from root with "su - oneadmin", which initializes
> >> the oneadmin shell environment
> >> - first we tried starting a VM without a network to minimize the sources
> >> of error.
> >> - we installed the script fix with sudo rights that is mentioned in the
> >> documentation. It is needed to change the owner so that the oneadmin
> >> user of ESX is able to access the files in order to start your VM. I
> >> guess your problem can be solved somewhere near the "starting a VM"
> >> step.
> >> - the [images] ESX volume should be the correct one (DATASTORE) and can
> >> be checked via vSphere Client. The oneadmin user has to have the same
> >> UID on ESX and on Linux. We also verified this by starting a manually
> >> placed VM via vSphere Client.
> >> - specifying ARCH=i686 is necessary (as you did).
> >> - activate logging for the VMM driver (because I assume your TM driver
> >> works properly):
> >> if your oned.conf looks like this:
> >> #  VMware Driver Addon Virtualization Driver Manager Configuration
> >> #-------------------------------------------------------------------------------
> >> VM_MAD = [
> >>    name       = "vmm_vmware",
> >>    executable = "one_vmm_sh",
> >>    arguments  = "vmware",
> >>    default    = "vmm_sh/vmm_sh_vmware.conf",
> >>    type       = "vmware" ]
> >>
> >> then the file /srv/cloud/one/etc/vmm_sh/vmm_sh_vmwarerc should contain
> >> the following line:
> >> # Uncomment the following line to activate MAD debug
> >> ONE_MAD_DEBUG="1"
> >>
> >> To enable it for every driver, editing the following file should work,
> >> but I'm not really sure about that:
> >> /srv/cloud/one/etc/defaultrc
> >>
> >> Logs are placed in /srv/cloud/one/var
> >>
> >> I hope this helps you,
> >>
> >> Matthias Keller
> >>
> >>
> >>
> >>
> >
> >
> >
> > --
> > Luigi Fortunati
> >
> > _______________________________________________
> > Users mailing list
> > Users at lists.opennebula.org
> > http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
> >
> >
>



-- 
Luigi Fortunati

