Hi Matthias,<div>First of all, thanks for the reply.</div><div>I checked on the logs that come up in the "var" folder when the "ONE_MAD_DEBUG" option is set to 1 in $ONE_LOCATION/etc/defaultrc file (enabling logging for all the vmware drivers) but the content seems a bit cryptic and unhelpful.</div>
<div><br></div><div>Anyhow, I managed to solve the problem with launching a new VM. I realized that the cluster hosts act as the root user when managing files on the NFS repository. Looking at the settings of my NFS server, I noticed that root squashing was enabled, so the root user of the cluster hosts was mapped to the user with UID 65534 (user "nobody") on the OpenNebula frontend host. As a result, the root user of the cluster hosts couldn't access files placed on the shared image repository because it lacked the necessary permissions.</div>
<div>I decided to solve this problem by modifying the /etc/exports file on the OpenNebula frontend:</div><div><br></div><div><font face="'courier new', monospace">/srv/cloud/one/var <ip address/mask>(rw,all_squash,anonuid=<id of 'oneadmin' user>,anongid=<id of 'cloud' group>)</font></div>
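<div><font face="arial, helvetica, sans-serif">To illustrate why root ended up as "nobody" and what all_squash changes, here is a minimal Python sketch of the ID mapping an NFS server applies. It is only a model of the documented export options, not code from OpenNebula or the NFS server; the function name squash and the value 1001 for the oneadmin/cloud IDs are assumptions made for the example.</font></div>

```python
NOBODY = 65534  # default anonymous UID/GID on Linux NFS servers


def squash(uid, gid, options):
    """Model the (uid, gid) an NFS server uses for a client request.

    `options` is a dict of export options, e.g.
    {"all_squash": True, "anonuid": 1001, "anongid": 1001}.
    """
    anon_uid = options.get("anonuid", NOBODY)
    anon_gid = options.get("anongid", NOBODY)
    if options.get("all_squash", False):
        # all_squash: every client user is mapped to the anonymous IDs
        return anon_uid, anon_gid
    if options.get("root_squash", True):
        # root_squash (the default): only ID 0 is mapped to the anonymous IDs
        return (anon_uid if uid == 0 else uid,
                anon_gid if gid == 0 else gid)
    return uid, gid


# Default export: root on the cluster host becomes nobody (65534)
print(squash(0, 0, {}))
# Hypothetical IDs: oneadmin = 1001, cloud = 1001
print(squash(0, 0, {"all_squash": True, "anonuid": 1001, "anongid": 1001}))
```

<div><font face="arial, helvetica, sans-serif">The first call shows the original situation (root mapped to 65534); the second shows the effect of all_squash with anonuid/anongid pointing at the chosen account.</font></div>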
<div><br></div><div>With these settings, all users of the cluster hosts are mapped to oneadmin:cloud on the frontend.</div><div><br></div><div>So far I can start a VM that uses nonpersistent images, but I can't start a VM with persistent images.</div>
<div>This is the output of vm.log:</div><div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [DiM][I]: New VM state is ACTIVE.</font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [LCM][I]: New VM state is PROLOG.</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VM][I]: Virtual Machine has no context</font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [TM][I]: tm_ln.sh: Creating directory /srv/cloud/one/var/49/images</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "mkdir -p /srv/cloud/one/var/49/images".</font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "chmod a+w /srv/cloud/one/var/49/images".</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [TM][I]: tm_ln.sh: Link /srv/cloud/one/var/images/7594aeafecbabec0de9da508cf2500fb486675a6</font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [TM][I]: tm_ln.sh: Executed "ln -s ../../images/7594aeafecbabec0de9da508cf2500fb486675a6 /srv/cloud/</font></div>
<div><font face="'courier new', monospace">one/var/49/images/disk.0".</font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [LCM][I]: New VM state is BOOT</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][I]: Generating deployment file: /srv/cloud/one/var/49/deployment.0</font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][I]: Command execution fail: /srv/cloud/one/lib/remotes/vmm/vmware/deploy <a href="http://custom6.sns.it" target="_blank">custom6.sns.it</a> /srv/c</font></div>
<div><font face="'courier new', monospace">loud/one/var/49/deployment.0</font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][I]: STDERR follows.</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][I]: [VMWARE] cmd failed [/srv/cloud/one/bin/tty_expect -u oneadmin -p password1234 virsh -c esx://<a href="http://custom6.sns.it?no_verify=1" target="_blank">custom6.sns.it?no_verify=1</a> define /srv/cloud/one/var/49/deployment.0]. Stderr:</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][I]: error: Failed to define domain from /srv/cloud/one/var/49/deployment.0</font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][I]: error: internal error HTTP response code 500 for upload to '<a href="https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images" target="_blank">https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images</a>'</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][I]: </font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][I]: . Stdout: ExitCode: 1</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][I]: ExitCode: 1</font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][E]: Error deploying virtual machine</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [DiM][I]: New VM state is FAILED</font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [TM][W]: Ignored: LOG - 49 tm_delete.sh: Deleting /srv/cloud/one/var/49/images</font></div>
<div><font face="'courier new', monospace"><br></font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [TM][W]: Ignored: LOG - 49 tm_delete.sh: Executed "rm -rf /srv/cloud/one/var/49/images".</font></div>
<div><font face="'courier new', monospace"><br></font></div><div><font face="'courier new', monospace">Wed Feb 2 15:51:54 2011 [TM][W]: Ignored: TRANSFER SUCCESS 49 -</font></div>
<div><font face="'courier new', monospace"><br></font></div><div><font face="'courier new', monospace">Wed Feb 2 16:38:54 2011 [DiM][I]: New VM state is DONE.</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 16:38:54 2011 [HKM][I]: Command execution fail: /srv/cloud/one/share/hooks/image.rb 49</font></div><div><font face="'courier new', monospace">Wed Feb 2 16:38:54 2011 [HKM][I]: STDERR follows.</font></div>
<div><font face="'courier new', monospace">Wed Feb 2 16:38:54 2011 [HKM][I]: ExitCode: 255</font></div><div><font face="'courier new', monospace">Wed Feb 2 16:38:54 2011 [HKM][E]: Error executing Hook: image.</font></div>
</div><div><br></div><div><font face="arial, helvetica, sans-serif">Given that OpenNebula creates a soft link in the directory $ONE_LOCATION/var/49/images/ pointing to the directory of the persistent image and then writes into the deployment.0 file (the XML file in </font><span style="font-family:arial, helvetica, sans-serif">$ONE_LOCATION/var/49/) the path of the vmdk disk (</span><span style="font-family:arial, helvetica, sans-serif"><source file='[images] 49/images/disk.0/disk.vmdk'/>), the use of a soft link seems to be the problem.</span></div>
<div><span style="font-family:arial, helvetica, sans-serif">I tried to change the path to the vmdk file in the deployment.0 file like this:</span></div><div><span style="font-family:arial, helvetica, sans-serif"><source file='[images] images/7594aeafecbabec0de9da508cf2500fb486675a6/disk.vmdk'/></span></div>
<div><span style="font-family:arial, helvetica, sans-serif">so that now it points to the disk directly, without passing through the soft link...</span></div><div><span style="font-family:arial, helvetica, sans-serif"><br>
</span></div><div><span style="font-family:arial, helvetica, sans-serif">and then launched the virsh command that defines the new domain on the cluster host:</span></div><div><span style="font-family:arial, helvetica, sans-serif"><span style="font-family:'courier new', monospace">/srv/cloud/one/bin/tty_expect -u oneadmin -p custom2011 virsh -c esx://<a href="http://custom6.sns.it?no_verify=1" target="_blank">custom6.sns.it?no_verify=1</a> define /srv/cloud/one/var/49/deployment.0</span></span></div>
<div><font face="arial, helvetica, sans-serif">This last attempt succeeded.</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">Why can't it work with symlinks?</font></div>
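<div><font face="arial, helvetica, sans-serif">The manual edit of deployment.0 can be sketched in Python as follows: resolve the symlink and express the result relative to the datastore root. This is only a sketch, not part of the OpenNebula drivers; the function name datastore_path is hypothetical, and it assumes that the [images] datastore corresponds to /srv/cloud/one/var on the frontend, as the paths in the log suggest.</font></div>

```python
import os


def datastore_path(datastore_name, datastore_root, file_path):
    """Return '[name] relative/path' with all symlinks in file_path resolved."""
    real_root = os.path.realpath(datastore_root)
    real_file = os.path.realpath(file_path)
    rel = os.path.relpath(real_file, real_root)
    # Refuse paths whose real location lies outside the datastore root
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError("%s resolves outside of %s" % (file_path, datastore_root))
    return "[%s] %s" % (datastore_name, rel)
```

<div><font face="arial, helvetica, sans-serif">With the layout from the log, calling datastore_path("images", "/srv/cloud/one/var", "/srv/cloud/one/var/49/images/disk.0/disk.vmdk") would give the direct path under images/7594aeafecbabec0de9da508cf2500fb486675a6/ instead of the path through disk.0.</font></div>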
<div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">Moreover, given this log entry from vm.log:</font></div><div>
<font face="arial, helvetica, sans-serif"><span style="font-family:'courier new', monospace">Wed Feb 2 15:51:54 2011 [VMM][I]: error: internal error HTTP response code 500 for upload to '<a href="https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images" target="_blank">https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx?dcPath=ha%2ddatacenter&dsName=images</a>'</span></font></div>
<div><font face="'courier new', monospace"><br></font></div><div><font face="arial, helvetica, sans-serif">What does it mean?</font></div><div><font face="arial, helvetica, sans-serif">If we decode the percent-encoded characters, the URL looks like this: </font><span style="font-family:'courier new', monospace"><a href="https://custom6.sns.it:443/folder/49/images/disk.0/one-49.vmx?dcPath=ha-datacenter&dsName=images" target="_blank">https://custom6.sns.it:443/folder/49/images/disk.0/one-49.vmx?dcPath=ha-datacenter&dsName=images</a></span></div>
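<div><font face="arial, helvetica, sans-serif">The decoding can be reproduced with the Python standard library. The helper name decode_upload_url is only for illustration. Note that the decoded folder is 49/images/disk.0, i.e. the symlinked directory, which may be related to the failure, although that is just a guess.</font></div>

```python
from urllib.parse import parse_qs, unquote, urlsplit


def decode_upload_url(url):
    """Return the percent-decoded path and the query parameters of a URL."""
    parts = urlsplit(url)
    path = unquote(parts.path)
    # parse_qs decodes the values; keep the first value of each parameter
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return path, params


URL = ("https://custom6.sns.it:443/folder/49%2fimages%2fdisk%2e0/one%2d49.vmx"
       "?dcPath=ha%2ddatacenter&dsName=images")

path, params = decode_upload_url(URL)
print(path)    # /folder/49/images/disk.0/one-49.vmx
print(params)  # {'dcPath': 'ha-datacenter', 'dsName': 'images'}
```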
<div><font face="arial, helvetica, sans-serif">Does it mean that it can't upload the vmx file to the given folder?</font></div><div>
<font face="arial, helvetica, sans-serif">This problem is probably more related to libvirt. I found good information about libvirt and ESX here:</font></div><div><font face="arial, helvetica, sans-serif"><a href="http://libvirt.org/drvesx.html">http://libvirt.org/drvesx.html</a> (this may help you as it helped me)</font></div>
<div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">Can someone post the deployment.0 file of a VM that launched correctly, for both a persistent and a nonpersistent image?</font></div>
<div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif">P.S.: Another thing I noticed is that the VM doesn't get scheduled on a cluster host if you don't set the number of VCPUs in the VM template file!</font></div>
<div><br><div class="gmail_quote">On Tue, Feb 1, 2011 at 2:37 PM, Matthias Keller <span dir="ltr"><<a href="mailto:mkeller@upb.de" target="_blank">mkeller@upb.de</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear Luigi,<br>
<br>
Referring to your last post today (01.02), I might be able to help. We are in the same situation, as we are also trying to build an OpenNebula cloud with ESXi servers. After some headaches caused by my limited administration skills and by somewhat unclear documentation of the website and the configuration process, we figured out how to launch VMs on ESX servers. As I'm not able to diagnose the issues you described, I'll try to summarize our experience:<br>
<br>
- we had to reinstall our OpenNebula 2 installation, because the system-wide installation (/etc/oned.conf, etc.) didn't fully work with some scripts, especially those of the ESX drivers plugin. So our second attempt ended up, like your installation, in /srv/cloud/one<br>
- after logging in again, the ONE_LOCATION environment variable was not set correctly, so we had to make sure it is set in .bashrc and switch from root using "su - oneadmin", which initializes the oneadmin shell environment<br>
- first we tried starting a VM without a network to minimize the sources of error.<br>
- we installed the script fix mentioned in the documentation with sudo rights. It is needed to change the owner, so that the oneadmin user on ESX can access it in order to start your VM. I guess your problem can be solved somewhere around "starting a VM".<br>
- the [image] ESX volume should be the correct one (DATASTORE), which can be checked via the vSphere client. The oneadmin user has to have the same UID on ESX and Linux. We also checked this by starting a placed VM by hand via the vSphere client.<br>
- specifying Arch=i686 is necessary (as you did).<br>
- activate logging for the VMM driver (because I assume your TM driver works properly):<br>
if your oned.conf looks like:<br>
# VMware Driver Addon Virtualization Driver Manager Configuration<br>
#-------------------------------------------------------------------------------<br>
VM_MAD = [<br>
name = "vmm_vmware",<br>
executable = "one_vmm_sh",<br>
arguments = "vmware",<br>
default = "vmm_sh/vmm_sh_vmware.conf",<br>
type = "vmware" ]<br>
<br>
then the file should contain the following:<br>
file:/srv/cloud/one/etc/vmm_sh/vmm_sh_vmwarerc<br>
# Uncomment the following line to active MAD debug<br>
ONE_MAD_DEBUG="1"<br>
<br>
Editing the following file should enable it for every driver, but I'm not really sure about that:<br>
/srv/cloud/one/etc/defaultrc<br>
<br>
Logs are placed in /srv/cloud/one/var<br>
<br>
I hope this helps,<br>
<font color="#888888"><br>
Matthias Keller<br>
<br>
<br>
<br>
<br>
</font></blockquote></div><br><br clear="all"><br>-- <br>Luigi Fortunati<br>
</div>