[one-users] Problems when booting VM (OpenNebula 2.0.1 and ESXi 4.1)

Tino Vazquez tinova at fdi.ucm.es
Mon Feb 21 03:37:24 PST 2011


Hi Luigi,

OK, I'll have to test that with 4.1; it certainly works with soft
links in previous versions.

Regards,

-Tino

--
Constantino Vázquez Blanco | dsa-research.org/tinova
Virtualization Technology Engineer / Researcher
OpenNebula Toolkit | opennebula.org



On Thu, Feb 17, 2011 at 11:16 AM, Luigi Fortunati
<luigi.fortunati at gmail.com> wrote:
> Hi,
> I fixed the problem related to the use of symlinks.
> I found out that by making a symlink to a directory (which is what
> OpenNebula does) my ESXi 4.1 server sees the directory not as
> "disk.0" but with the same name as the directory in the image repository
> (something like "44bdf63hf63kd73nfetcetc..."). You can check that by
> browsing the "images" repository with the vSphere Client. I decided to
> change the script vmware/tm_ln.sh so that it creates a directory disk.0
> in $ONE_LOCATION/var/<vmID>/images/ and then creates hard links to the
> vmdk files in the image repo, and now persistent images are working. :-)
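> A minimal sketch of the change (SRC_PATH and DST_PATH are placeholder
> names for the image directory in the repository and for
> $ONE_LOCATION/var/<vmID>/images/disk.0; the stock script may name them
> differently):
>
>   # create a real disk.0 directory instead of a symlink to the repo dir
>   mkdir -p "$DST_PATH"
>   chmod a+w "$DST_PATH"
>   # hard-link each vmdk file so ESXi sees the directory as "disk.0"
>   # (hard links require both paths to be on the same filesystem)
>   for f in "$SRC_PATH"/*.vmdk; do
>       ln "$f" "$DST_PATH/"
>   done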
>
> On Mon, Feb 7, 2011 at 5:15 PM, Tino Vazquez <tinova at fdi.ucm.es> wrote:
>>
>> Hi Luigi,
>>
>> There is indeed a bug for system-wide installations; we are working on
>> it and it will be fixed in the upcoming 2.2 release.
>>
>> About the symlink, we have tests for the Image catalog in VMware that
>> show it working for both persistent and non-persistent images. The
>> template is trivial, just using an image flagged as persistent within a
>> VM. Maybe the problem lies somewhere else; for instance, are you
>> using VMFS? In our tests we use NFS.
>>
>> About the VCPU, the scheduler completely ignores this parameter, so it
>> shouldn't make a difference for scheduling.
>>
>> Regards,
>>
>> -Tino
>>
>> --
>> Constantino Vázquez Blanco | dsa-research.org/tinova
>> Virtualization Technology Engineer / Researcher
>> OpenNebula Toolkit | opennebula.org
>>
>>
>>
>> On Thu, Feb 3, 2011 at 11:08 AM, Luigi Fortunati
>> <luigi.fortunati at gmail.com> wrote:
>> > Hi Matthias,
>> > First of all, thanks for the reply.
>> > I checked the logs that show up in the "var" folder when the
>> > "ONE_MAD_DEBUG" option is set to 1 in the $ONE_LOCATION/etc/defaultrc
>> > file (enabling logging for all the VMware drivers), but the content
>> > seems a bit cryptic and unhelpful.
>> > Anyhow, I managed to solve the problem regarding the launch of a new
>> > VM. I realized that the cluster hosts work as the root user when
>> > managing files on the NFS repository. Given the settings of my NFS
>> > server, I noticed that root squashing was enabled, so the root user of
>> > the cluster hosts was mapped to UID 65534 (user "nobody") on the
>> > OpenNebula frontend host. Therefore the root user of the cluster hosts
>> > couldn't access files placed on the shared image repository because it
>> > didn't have sufficient permissions.
>> > I decided to solve this problem by modifying the /etc/exports file on
>> > the OpenNebula frontend:
>> > /srv/cloud/one/var  <ip address/mask>(rw,all_squash,anonuid=<id of
>> > 'oneadmin' user>,anongid=<id of 'cloud' group>)
>> > With these settings all users of the cluster hosts are mapped to
>> > oneadmin:cloud on the frontend.
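>> > For reference, a filled-in version looks something like this (the
>> > network and the UID/GID below are just examples, adjust them to your
>> > setup); after editing /etc/exports, re-export and check:
>> >
>> >   # /etc/exports on the frontend
>> >   /srv/cloud/one/var 192.168.1.0/24(rw,all_squash,anonuid=1001,anongid=1001)
>> >
>> >   exportfs -ra   # reload the export table
>> >   exportfs -v    # verify the options were actually applied
>> >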
>> > Up to now I can start a VM that uses non-persistent images, but I
>> > can't start a VM with persistent images.
>> > This is the output of vm.log:
>> > Wed Feb  2 15:51:54 2011 [DiM][I]: New VM state is ACTIVE.
>> > Wed Feb  2 15:51:54 2011 [LCM][I]: New VM state is PROLOG.
>> > Wed Feb  2 15:51:54 2011 [VM][I]: Virtual Machine has no context
>> > Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Creating directory
>> > /srv/cloud/one/var/49/images
>> > Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "mkdir -p
>> > /srv/cloud/one/var/49/images".
>> > Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "chmod a+w
>> > /srv/cloud/one/var/49/images".
>> > Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Link
>> > /srv/cloud/one/var/images/7594aeafecbabec0de9da508cf2500fb486675a6
>> > Wed Feb  2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "ln -s
>> > ../../images/7594aeafecbabec0de9da508cf2500fb486675a6 /srv/cloud/
>> > one/var/49/images/disk.0".
>> > Wed Feb  2 15:51:54 2011 [LCM][I]: New VM state is BOOT
>> > Wed Feb  2 15:51:54 2011 [VMM][I]: Generating deployment file:
>> > /srv/cloud/one/var/49/deployment.0
>> > Wed Feb  2 15:51:54 2011 [VMM][I]: Command execution fail:
>> > /srv/cloud/one/lib/remotes/vmm/vmware/deploy custom6.sns.it /srv/c
>> > loud/one/var/49/deployment.0
>> > Wed Feb  2 15:51:54 2011 [VMM][I]: STDERR follows.
>> > Wed Feb  2 15:51:54 2011 [VMM][I]: [VMWARE] cmd failed
>> > [/srv/cloud/one/bin/tty_expect -u oneadmin -p password1234 virsh -c
>> > esx://custom6.sns.it?no_verify=1 define
>> > /srv/cloud/one/var/49/deployment.0].
>> > Stderr:
>> > Wed Feb  2 15:51:54 2011 [VMM][I]: error: Failed to define domain from
>> > /srv/cloud/one/var/49/deployment.0
>> > Wed Feb  2 15:51:54 2011 [VMM][I]: error: internal error HTTP response
>> > code
>> > 500 for upload to
>> >
>> > 'https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images'
>> > Wed Feb  2 15:51:54 2011 [VMM][I]:
>> > Wed Feb  2 15:51:54 2011 [VMM][I]: . Stdout: ExitCode: 1
>> > Wed Feb  2 15:51:54 2011 [VMM][I]: ExitCode: 1
>> > Wed Feb  2 15:51:54 2011 [VMM][E]: Error deploying virtual machine
>> > Wed Feb  2 15:51:54 2011 [DiM][I]: New VM state is FAILED
>> > Wed Feb  2 15:51:54 2011 [TM][W]: Ignored: LOG - 49 tm_delete.sh:
>> > Deleting
>> > /srv/cloud/one/var/49/images
>> > Wed Feb  2 15:51:54 2011 [TM][W]: Ignored: LOG - 49 tm_delete.sh:
>> > Executed
>> > "rm -rf /srv/cloud/one/var/49/images".
>> > Wed Feb  2 15:51:54 2011 [TM][W]: Ignored: TRANSFER SUCCESS 49 -
>> > Wed Feb  2 16:38:54 2011 [DiM][I]: New VM state is DONE.
>> > Wed Feb  2 16:38:54 2011 [HKM][I]: Command execution fail:
>> > /srv/cloud/one/share/hooks/image.rb 49
>> > Wed Feb  2 16:38:54 2011 [HKM][I]: STDERR follows.
>> > Wed Feb  2 16:38:54 2011 [HKM][I]: ExitCode: 255
>> > Wed Feb  2 16:38:54 2011 [HKM][E]: Error executing Hook: image.
>> > Given that OpenNebula creates a soft link in the directory
>> > $ONE_LOCATION/var/49/images/ pointing to the directory of the
>> > persistent image, and then writes in the deployment.0 file (the XML
>> > file in $ONE_LOCATION/var/49/) the path of the vmdk disk
>> > (<source file='[images] 49/images/disk.0/disk.vmdk'/>), the use of a
>> > soft link seems to be the problem.
>> > I tried to change the path to the vmdk file in the deployment.0 file
>> > like this:
>> > <source file='[images]
>> > images/7594aeafecbabec0de9da508cf2500fb486675a6/disk.vmdk'/>
>> > so that now it points to the disk directly, without passing through
>> > the soft link, and then launched the virsh command that defines the
>> > new domain on the cluster host:
>> > /srv/cloud/one/bin/tty_expect -u oneadmin -p custom2011 virsh -c
>> > esx://custom6.sns.it?no_verify=1 define
>> > /srv/cloud/one/var/49/deployment.0
>> > This last attempt completed correctly.
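>> > In short, the manual workaround boils down to something like this
>> > (the sed expression is only an illustration; the image hash and the
>> > password are the ones from my own setup, so replace them with yours):
>> >
>> >   # point the disk source at the real image directory instead of the symlink
>> >   sed -i 's|49/images/disk.0|images/7594aeafecbabec0de9da508cf2500fb486675a6|' \
>> >       /srv/cloud/one/var/49/deployment.0
>> >   # then define the domain again
>> >   /srv/cloud/one/bin/tty_expect -u oneadmin -p custom2011 virsh \
>> >       -c 'esx://custom6.sns.it?no_verify=1' define \
>> >       /srv/cloud/one/var/49/deployment.0
>> >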
>> > Why can't it work with symlinks?
>> > Moreover, given this log entry from vm.log:
>> > Wed Feb  2 15:51:54 2011 [VMM][I]: error: internal error HTTP response
>> > code
>> > 500 for upload to
>> >
>> > 'https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images'
>> > What does it mean?
>> > If we decode the URL escapes, the path looks like this:
>> > https://custom6.sns.it:443/folder/49/images/disk.0/one-49.vmx?dcPath=ha-datacenter&dsName=images
>> > Does it mean that it can't upload the vmx file to the given folder?
>> > This problem is probably more related to libvirt. I found good
>> > information about libvirt and ESX here:
>> > http://libvirt.org/drvesx.html (this may help you as it has helped me)
>> > Can someone post the deployment.0 file of a VM launched correctly with
>> > either a persistent or a non-persistent image?
>> > P.S.: Another thing I noticed is that the VM doesn't get scheduled on
>> > a cluster host if you don't set the number of VCPUs in the VM template
>> > file!
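>> > (For anyone else hitting this: I just mean adding a line like the
>> > following to the VM template, next to CPU and MEMORY:
>> >   VCPU = 1
>> > )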
>> > On Tue, Feb 1, 2011 at 2:37 PM, Matthias Keller <mkeller at upb.de> wrote:
>> >>
>> >> Dear Luigi,
>> >>
>> >> referring to your last post today (01.02), I might be able to help.
>> >> We are in the same situation, as we are also trying to build an
>> >> OpenNebula cloud with ESXi servers. After some headaches caused by my
>> >> limited administration skills and the somewhat unclear documentation
>> >> of the website / configuration process, we managed to launch VMs on
>> >> ESX servers. As I'm not able to diagnose the issues you describe, I'll
>> >> try to summarize our experiences:
>> >>
>> >> - we had to reinstall our OpenNebula 2 installation, because the
>> >> system-wide installation (/etc/oned.conf, etc.) didn't fully work:
>> >> some scripts, especially those of the ESX drivers plugin, failed. So
>> >> our second attempt ended up, like your installation, in /srv/cloud/one
>> >> - after logging in again, the ONE_LOCATION environment variable was
>> >> not set correctly, so we had to make sure it is set somewhere in
>> >> .bashrc and to switch from root with "su - oneadmin", which
>> >> initializes the oneadmin shell environment
>> >> - first we tried starting a VM without a network to minimize the
>> >> sources of error.
>> >> - we installed the script fix mentioned in the documentation, with
>> >> sudo rights; it is needed to change the owner so that the oneadmin
>> >> user on ESX is able to access the files in order to start your VM. I
>> >> guess your problem can be solved somewhere around "starting a VM".
>> >> - the [image] ESX volume should be the correct one (DATASTORE) and
>> >> can be checked via the vSphere client. The oneadmin user has to have
>> >> the same UID on ESX and Linux. We also checked this by starting a
>> >> placed VM by hand via the vSphere client.
>> >> - specifying Arch=i686 is necessary (as you did).
>> >> - activate logging for the VMM driver (because I assume your TM
>> >> driver works properly):
>> >> if your oned.conf looks like:
>> >> #  VMware Driver Addon Virtualization Driver Manager Configuration
>> >>
>> >>
>> >> #-------------------------------------------------------------------------------
>> >> VM_MAD = [
>> >>    name       = "vmm_vmware",
>> >>    executable = "one_vmm_sh",
>> >>    arguments  = "vmware",
>> >>    default    = "vmm_sh/vmm_sh_vmware.conf",
>> >>    type       = "vmware" ]
>> >>
>> >> then the file /srv/cloud/one/etc/vmm_sh/vmm_sh_vmwarerc should
>> >> contain the following:
>> >> # Uncomment the following line to activate MAD debug
>> >> ONE_MAD_DEBUG="1"
>> >>
>> >> To enable it for every driver, editing the following file should
>> >> work - but I'm not really sure about that:
>> >> /srv/cloud/one/etc/defaultrc
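>> >> A guess at what that would look like (based on the per-driver rc file
>> >> above, so please verify it before relying on it):
>> >>
>> >>   # /srv/cloud/one/etc/defaultrc - enable MAD debug output for all drivers
>> >>   ONE_MAD_DEBUG="1"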
>> >>
>> >> Logs are placed in /srv/cloud/one/var
>> >>
>> >> I hope this helps you,
>> >>
>> >> Matthias Keller
>> >>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Luigi Fortunati
>> >
>> > _______________________________________________
>> > Users mailing list
>> > Users at lists.opennebula.org
>> > http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>> >
>> >
>
>
>
> --
> Luigi Fortunati
>
> _______________________________________________
> Users mailing list
> Users at lists.opennebula.org
> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>
>


