[one-users] VMware VMs stay in boot state

Stefan Reichel stefan.reichel at student.hpi.uni-potsdam.de
Sun Jan 31 07:38:18 PST 2010


Hi Tino,

I just had a closer look at OneVmmVmware.java. You redirect all output of this class to /dev/null so that the normal stdin and stdout can be used for communication. I changed the default /dev/null to a file and added some print lines to the code. The file for the errors stays empty, but the file for the default output is filled:

Got DEPLOY command
Connect to host
Started
Register VM
Shape VM
Ended DeployVM
Started
Try to power on
Sending: DEPLOY SUCCESS 10 Test-VM-10
Ended OperationsOverVM
Sending done


The line "Sending done" is printed directly after the invocation of the send_message method. The content of the message is also printed; you can find it after "Sending:". As you can see, the Java code works like a charm, so the problem must be in a layer above, perhaps in the Ruby code? I would debug it myself, but I am clueless about Ruby.
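
To check from outside whether a reply line really arrives on the driver's standard output (and is not lost somewhere between the Java class and the layer reading it), a small harness could start the driver, send one request line and wait for the first reply with a timeout. This is only a sketch in Python, not OpenNebula code: send_and_wait, REPLYING_DRIVER and SILENT_DRIVER are made-up names, and the two stand-in "drivers" are one-line Python scripts so that the example runs without VMware.

```python
import queue
import subprocess
import sys
import threading

# Hypothetical stand-ins for the real driver, so the sketch runs anywhere.
# The first answers one request line, the second reads it and stays silent.
REPLYING_DRIVER = [sys.executable, "-c",
                   "import sys; sys.stdin.readline(); "
                   "print('DEPLOY SUCCESS 10 Test-VM-10')"]
SILENT_DRIVER = [sys.executable, "-c",
                 "import sys, time; sys.stdin.readline(); time.sleep(30)"]


def send_and_wait(driver_argv, request, timeout=5.0):
    """Start a driver, send one request line, return its first reply line.

    Returns None if no reply line shows up on the driver's stdout in time.
    """
    proc = subprocess.Popen(
        driver_argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered on our side of the pipes
    )
    replies = queue.Queue()

    def reader():
        # readline() blocks, so it runs in a thread and we wait on the queue.
        replies.put(proc.stdout.readline())

    threading.Thread(target=reader, daemon=True).start()
    try:
        proc.stdin.write(request + "\n")
        proc.stdin.flush()
        try:
            reply = replies.get(timeout=timeout)
        except queue.Empty:
            return None
        # An empty string means the driver closed stdout without answering.
        return reply.rstrip("\n") or None
    finally:
        proc.kill()
        proc.wait()
```

With the real driver, driver_argv would be the one_vmm_vmware script and the request a DEPLOY line like the one I pasted by hand. A None result would mean that nothing reached stdout within the timeout, even if the Java side logged the message as sent.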

Best regards
Stefan


On 29.01.2010 at 02:18, Stefan Reichel wrote:

> Hi Tino,
> 
> I tried your command with 5 parameters; I think you missed the checkpoint. The result is easy to describe: there is nothing. The script itself hangs and the log files don't contain any failure or success message.
> Therefore I tried the Java class directly by calling:
> 
> java -Ddebug=1 OneVmmVmware --username oneadmin --password xxxx --ignorecert
> 
> And pasting:
> 
> DEPLOY 1 fqdn /usr/share/one/var/1/deployment.0 CP1
> 
> The script itself seems to work, judging by the error I get (see the end of this mail). Once I also got another error that was connected to the network, but I think this was caused by a network misconfiguration. Nevertheless, I have also included that log and the network file. The oned.log is also quite useless: after the "prolog success" message the monitoring begins, with no deploy success at all. Once I also saw a line "failure:     " after it, without any reason given. Perhaps this is connected to the Java output below, because no reason is given there either (Reason: null). In that case it would probably be caused by a race condition, which would also explain why it only happens sometimes.
> 
> I hope the output and descriptions give you an indication of how to find the reason for our problem.
> 
> Best regards
> Stefan
> 
> 
> 
> 
> 
> 
> 
> Output of java -Ddebug=1 OneVmmVmware --username oneadmin --password xxxx --ignorecert:
> 
> DEPLOY 1 fqdn /usr/share/one/var/1/deployment.0 CP1
> DEPLOY FAILURE 1 Failed deploying VM in host fqdn.
> [29.01.2010 01:52:17] Failed deploying VM 1 into fqdn.Reason: null
> ---- Debug stack trace ----
> AxisFault
>  faultCode: ServerFaultCode
>  faultSubcode: 
>  faultString: The attempted operation cannot be performed in the current state (Powered On).
>  faultActor: 
>  faultNode: 
>  faultDetail: 
> 	{urn:vim2}InvalidPowerStateFault:<requestedState>poweredOn</requestedState><existingState>poweredOn</existingState>
> 
> The attempted operation cannot be performed in the current state (Powered On).
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
> 	at java.lang.Class.newInstance0(Class.java:372)
> 	at java.lang.Class.newInstance(Class.java:325)
> 	at org.apache.axis.encoding.ser.BeanDeserializer.<init>(BeanDeserializer.java:104)
> 	at org.apache.axis.encoding.ser.BeanDeserializer.<init>(BeanDeserializer.java:90)
> 	at com.vmware.vim.InvalidPowerState.getDeserializer(InvalidPowerState.java:156)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:616)
> 	......
> 	at org.apache.axis.client.AxisClient.invoke(AxisClient.java:206)
> 	at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
> 	at org.apache.axis.client.Call.invoke(Call.java:2767)
> 	at org.apache.axis.client.Call.invoke(Call.java:2443)
> 	at org.apache.axis.client.Call.invoke(Call.java:2366)
> 	at org.apache.axis.client.Call.invoke(Call.java:1812)
> 	at com.vmware.vim.VimBindingStub.powerOnVM_Task(VimBindingStub.java:24320)
> 	at OperationsOverVM.powerOn(OperationsOverVM.java:82)
> 	at OneVmmVmware.loop(OneVmmVmware.java:204)
> 	at OneVmmVmware.main(OneVmmVmware.java:57)
> [29.01.2010 01:52:17] ---------------------------
> 
> 
> 
> 
> 
> 
> Output of java -Ddebug=1 OneVmmVmware .... based on network misconfiguration?
> 
> DEPLOY FAILURE 0 Failed deploying VM in host fqdn.
> [29.01.2010 01:10:43] Failed deploying VM 0 into fqdn.Reason: null
> ---- Debug stack trace ----
> java.lang.NullPointerException
> 	at DeployVM.configureNetwork(DeployVM.java:268)
> 	at DeployVM.shapeVM(DeployVM.java:220)
> 	at OneVmmVmware.loop(OneVmmVmware.java:168)
> 	at OneVmmVmware.main(OneVmmVmware.java:57)
> [29.01.2010 01:10:43] ---------------------------
> 
> 
> 
> 
> Old network config:
> 
> NAME   = "VMWareNet"
> TYPE   = RANGED
> BRIDGE = NAT
> NETWORK_ADDRESS = 192.168.189.200
> NETWORK_SIZE = 254
> 
> 
> 
> 
> Oned.log (extract)
> Fri Jan 29 00:55:28 2010 [TM][D]: Message received: TRANSFER SUCCESS 0 -
> 
> Fri Jan 29 00:55:28 2010 [LCM][I]: prolog success:
> Fri Jan 29 00:55:39 2010 [VMM][I]: Recovering VMM drivers
> Fri Jan 29 00:56:03 2010 [ReM][D]: VirtualMachinePoolInfo method invoked
> Fri Jan 29 00:56:06 2010 [ReM][D]: VirtualMachinePoolInfo method invoked
> Fri Jan 29 00:56:51 2010 [VMM][I]: Monitoring VM 86.
> Fri Jan 29 00:56:54 2010 [InM][I]: Monitoring host fqdn (0)
> Fri Jan 29 00:56:57 2010 [InM][D]: Host 0 successfully monitored.
> Fri Jan 29 00:56:57 2010 [ReM][D]: VirtualMachinePoolInfo method invoked
> Fri Jan 29 00:57:01 2010 [VMM][D]: Message received: POLL SUCCESS 0 STATE=a
> 
> 
> On 28.01.2010 at 19:37, Tino Vazquez wrote:
> 
>> Hi Stefan,
>> 
>> Let's try executing the driver by hand. The VMM driver talks with
>> OpenNebula core using an ASCII protocol. So, if you execute the
>> driver:
>> 
>> $ONE_LOCATION/lib/mads/one_vmm_vmware
>> 
>> and hit enter, it should wait for input in the standard input, and you
>> will need to type:
>> 
>> ---8<----
>> DEPLOY 0 fqdn var/77/images/deployment.0
>> --->8----
>> 
>> assuming that $ONE_LOCATION/var/77 exists (i.e. a previous attempt to
>> run a VM with OpenNebula ID 77 has been made; it didn't have to
>> succeed).
>> 
>> The answer we are waiting for is a DEPLOY SUCCESS 0; this is what the
>> OpenNebula core seems not to be getting.
>> 
>> Your use case is very interesting, and we are happy to help. OpenNebula
>> doesn't feature a web GUI per se, but we offer a REST interface using
>> EC2 or OCCI, on top of which an AJAX application can be built.
>> 
>> Best regards,
>> 
>> -Tino
>> 
>> --
>> Constantino Vázquez, Grid & Virtualization Technology
>> Engineer/Researcher: http://www.dsa-research.org/tinova
>> DSA Research Group: http://dsa-research.org
>> Globus GridWay Metascheduler: http://www.GridWay.org
>> OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org
>> 
>> 
>> 
>> On Thu, Jan 28, 2010 at 1:05 AM, Stefan Reichel
>> <Stefan.Reichel at student.hpi.uni-potsdam.de> wrote:
>>> Hi Tino,
>>> I just wanted to write that everything is fine, but it isn't. The problem
>>> only occurs sometimes. At the end of this mail you will find some logs.
>>> VM(77) is up and running in the VM-Server, but at the same time in "boot"
>>> state. By the way, we currently use VMware-server-2.0.2-203138.i386.
>>> As you can see in the logs, the "DEPLOY SUCCESS" message is sent, or at
>>> least OneVmmVmware.java believes so. But it seems that it is never received.
>>> Sometimes it works, but this is not deterministic; in that case the message
>>> also appears in oned.log.
>>> 
>>> 
>>> The main goal of our project is to set up a network environment in which
>>> students can test and investigate several security weaknesses on live
>>> systems. For every such scenario we need different computers, which are
>>> simulated via VMs. In effect, when a new scenario is loaded, OpenNebula will
>>> be responsible for the VM setup and management.
>>> We will primarily use VMware images (with VM-Server) but also KVM. To control
>>> our scenarios we will implement a web interface, which will be used for
>>> management but also for monitoring. As far as I know, OpenNebula has
>>> only a command line frontend?
>>> Best Regards
>>> Stefan
>>> 
>>> 
>>> VM.LOG :
>>> Wed Jan 27 23:57:41 2010 [DiM][I]: New VM state is ACTIVE.
>>> Wed Jan 27 23:57:42 2010 [LCM][I]: New VM state is PROLOG.
>>> Wed Jan 27 23:57:42 2010 [VM][I]: Virtual Machine has no context
>>> Wed Jan 27 23:58:46 2010 [TM][I]: tm_clone.sh:
>>> fqdn:/srv/seclab/images-src/vmware/XP2 fqdn:/srv/seclab/vms/77/images/disk.0
>>> Wed Jan 27 23:58:46 2010 [TM][I]: tm_clone.sh: Cloning
>>> fqdn:/srv/seclab/images-src/vmware/XP2
>>> Wed Jan 27 23:58:46 2010 [LCM][I]: New VM state is BOOT
>>> Wed Jan 27 23:58:46 2010 [VMM][I]: Generating deployment file:
>>> /usr/share/one/var/77/deployment.0
>>> 
>>> ONED.LOG :
>>> Wed Jan 27 23:57:41 2010 [DiM][D]: Deploying VM 77
>>> Wed Jan 27 23:57:44 2010 [InM][I]: Monitoring host fqdn (0)
>>> Wed Jan 27 23:57:48 2010 [InM][D]: Host 0 successfully monitored.
>>> Wed Jan 27 23:57:52 2010 [VMM][I]: Monitoring VM 76.
>>> Wed Jan 27 23:57:53 2010 [VMM][D]: Message received: POLL SUCCESS 76 STATE=a
>>> USEDMEMORY=25 USEDCPU=0
>>> Wed Jan 27 23:58:46 2010 [TM][D]: Message received: LOG - 77 tm_clone.sh:
>>> fqdn:/srv/seclab/images-src/vmware/XP2 fqdn:/srv/seclab/vms/77/images/disk.0
>>> Wed Jan 27 23:58:46 2010 [TM][D]: Message received: LOG - 77 tm_clone.sh:
>>> Cloning fqdn:/srv/seclab/images-src/vmware/XP2
>>> Wed Jan 27 23:58:46 2010 [TM][D]: Message received: TRANSFER SUCCESS 77 -
>>> Wed Jan 27 23:58:46 2010 [LCM][I]: prolog success:
>>> 
>>> VMM_VMWARE.LOG (Output added to OneVmmVmware.java)
>>> [27.01.2010 23:41:40] TRY TO POWER ON
>>> [27.01.2010 23:41:45] DEPLOY SUCCESS
>>> [27.01.2010 23:58:50] TRY TO POWER ON
>>> [27.01.2010 23:58:54] DEPLOY SUCCESS
>>> On 27.01.2010 at 13:04, Tino Vazquez wrote:
>>> 
>>> Hi Stefan,
>>> 
>>> comments inline,
>>> 
>>> On Wed, Jan 27, 2010 at 10:27 AM, Stefan Reichel
>>> <stefan.reichel at student.hpi.uni-potsdam.de> wrote:
>>> 
>>> Hi,
>>> 
>>> I tried to analyze the bug and finally solve this problem. For now these are
>>> my results; please correct me if I am wrong.
>>> 
>>> First of all, the VM is running in VMware, but in "onevm list" it is
>>> still booting. The VirtualMachineManager::deploy_action has finished. Those
>>> are the facts; now my theory:
>>> 
>>> Normally the MadManager will receive a message and forward it to the
>>> corresponding VM driver by calling the protocol method. In my case this
>>> would be the VirtualMachineManagerDriver. Nevertheless, its protocol method
>>> is not called, and therefore it can't call the LifeCycleManager, which would
>>> in effect set the "running" state after reacting to a "DEPLOY_SUCCESS".
>>> Therefore I assume that the corresponding message is never sent. But who
>>> should send it?
>>> 
>>> The VMware VMM MAD is responsible for sending back the DEPLOY SUCCESS, so
>>> it is probably failing to do so. You mentioned in a previous email
>>> that the VM is already running, so I guess the driver is crashing
>>> badly after performing the powerOn (otherwise it would send the "DEPLOY
>>> FAILURE" and you would get a "fail" instead of a "boot" in the VM
>>> state). Do you see anything in the one_vmm_vmware log file?
>>> 
>>> 
>>> I temporarily fixed the problem by setting the running state manually in the
>>> mentioned deploy_action. I hope that someone will finally answer one of my
>>> messages. Indeed, it's my third unanswered message to this community.
>>> 
>>> Sorry for the delay in my answers, please take into account that this
>>> is a best effort support mailing list.
>>> 
>>> 
>>> We are trying to use OpenNebula in our current project, but the focus of our
>>> project is not to get software to do what it is supposed to do. Nevertheless, we
>>> are software developers, and therefore we could also fix and extend the
>>> OpenNebula project if there were any support from you.
>>> 
>>> That is great news!! Could you please elaborate a bit on what it is that
>>> you intend to do with the software? We are happy to provide best
>>> effort support.
>>> 
>>> Best regards,
>>> 
>>> -Tino
>>> 
>>> 
>>> Kind regards,
>>> 
>>> Stefan
>>> 
>>> 
>>> 
>>> 
>>> On 26.01.2010 at 21:41, Stefan Reichel wrote:
>>> 
>>> Hi OpenNebula team,
>>> 
>>> Our developer team tried to use OpenNebula, and we are now able to start VMs
>>> in VMware. But we have a serious problem: every VM we start stays in the
>>> "boot" state. What is the reason for that, and where can we gather more
>>> information about the problem? We use OpenNebula 1.4 / SVN version on Ubuntu
>>> 9.10 in combination with VMware Server 2. Any help would be appreciated.
>>> 
>>> Sincerely
>>> 
>>> Stefan
>>> 
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> 
>>> Users mailing list
>>> 
>>> Users at lists.opennebula.org
>>> 
>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>> 
>>> 
>>> 
>>> 
>>> 
>>> _____________________________________
>>> 
>>> Stefan Reichel,  M.Sc. Candidate
>>> 
>>> Hasso-Plattner-Institut für Softwaresystemtechnik GmbH
>>> Postfach 900460, D-14440 Potsdam, Germany
>>> http://www.hpi.uni-potsdam.de
>>> Telefon: 03322/206306  Mobile: 0178/5495023
>>> Email: stefan.reichel at student.hpi.uni-potsdam.de
>>> _____________________________________
>>> 
