[one-users] VMware VMs stay in boot state
Stefan Reichel
Stefan.Reichel at student.hpi.uni-potsdam.de
Mon Feb 1 16:04:39 PST 2010
Hi Tino,
you are right, the output was collected while the machines were still running, so the behavior of the vmm_vmware driver was correct. I also sent you a mail on Sunday with additional output, which likewise shows that the Java part of the driver works as expected. I therefore assume the problem must be somewhere in the other parts of the driver.
Best regards
Stefan
On 01.02.2010 at 12:55, Tino Vazquez wrote:
> Hi Stefan,
>
> From what I read in the Java stack trace, the machine is already
> powered on, which is why the deployment is failing. This may happen when
> you kill OpenNebula (without letting it shut down the VMs), clear its DB
> and try to submit VMs again: the names will clash (if one-1 is still
> running, a new deployment of a one-1 will fail).
>
> If this is not the case, please let me know and we will look at something else.
>
> Regards,
>
> -Tino
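The name clash Tino describes can be checked before a deployment is resubmitted. The following is a minimal sketch, assuming VMs are named `one-<id>` as in the `one-1` example; the `NameClashCheck` class and the set of running names are hypothetical, and how that set is obtained from the hypervisor is left out.

```java
import java.util.Set;

// Hypothetical helper: not part of OpenNebula or the VMware driver.
final class NameClashCheck {
    // Assumption: deployed VMs are named "one-<id>", as in the one-1 example.
    static String vmName(int vmId) {
        return "one-" + vmId;
    }

    // True if a VM with the same name is still running on the host,
    // in which case a new deployment with that ID is expected to fail.
    static boolean clashes(int vmId, Set<String> runningNames) {
        return runningNames.contains(vmName(vmId));
    }

    public static void main(String[] args) {
        Set<String> running = Set.of("one-1", "one-76");
        System.out.println(clashes(1, running));
        System.out.println(clashes(2, running));
    }
}
```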
>
> --
> Constantino Vázquez, Grid & Virtualization Technology
> Engineer/Researcher: http://www.dsa-research.org/tinova
> DSA Research Group: http://dsa-research.org
> Globus GridWay Metascheduler: http://www.GridWay.org
> OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org
>
>
>
> On Fri, Jan 29, 2010 at 2:18 AM, Stefan Reichel
> <Stefan.Reichel at student.hpi.uni-potsdam.de> wrote:
>> Hi Tino,
>> I tried your command with 5 parameters; I think you missed the checkpoint.
>> The result is easy to describe: there is nothing. The script itself hangs
>> and the log files don't contain any failure or success.
>> Therefore I tried the Java class directly by calling:
>> java -Ddebug=1 OneVmmVmware --username oneadmin --password xxxx --ignorecert
>> and pasting:
>> DEPLOY 1 fqdn /usr/share/one/var/1/deployment.0 CP1
>> The Java part itself seems to work, judging by the error I get (at the end of
>> this mail). I once also got another error, which was connected to the
>> network, but I think this was caused by a network misconfiguration. Nevertheless
>> I have included that log and the network file as well. The oned.log is also quite
>> useless: after the "prolog success" message the monitoring begins, with no deploy
>> success at all. Once I also saw a line "failure: " after it, without any
>> reason. Perhaps this is connected to the Java output below, because no reason
>> is given there either. In that case it would probably be caused by a race
>> condition, which would also explain why it only happens sometimes.
>> I hope the output and descriptions give you an indication of how to find the
>> reason for our problem.
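The five-parameter request line pasted above can be split into its parts before it is acted on. This is a minimal sketch; the `DriverRequest` class and its field names are hypothetical and are only one reading of the five tokens (action, VM ID, host, deployment file, checkpoint), not something taken from the driver source.

```java
import java.util.Optional;

// Hypothetical value type for one driver request line.
final class DriverRequest {
    final String action;
    final int vmId;
    final String host;
    final String deploymentFile;
    final String checkpoint;

    private DriverRequest(String action, int vmId, String host,
                          String deploymentFile, String checkpoint) {
        this.action = action;
        this.vmId = vmId;
        this.host = host;
        this.deploymentFile = deploymentFile;
        this.checkpoint = checkpoint;
    }

    // Returns empty unless the line has exactly five whitespace-separated
    // tokens and the second token is an integer.
    static Optional<DriverRequest> parse(String line) {
        String[] t = line.trim().split("\\s+");
        if (t.length != 5) {
            return Optional.empty();
        }
        try {
            return Optional.of(
                new DriverRequest(t[0], Integer.parseInt(t[1]), t[2], t[3], t[4]));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Five tokens, as pasted in this mail.
        System.out.println(
            parse("DEPLOY 1 fqdn /usr/share/one/var/1/deployment.0 CP1").isPresent());
        // Four tokens, i.e. a line without the checkpoint.
        System.out.println(
            parse("DEPLOY 0 fqdn var/77/images/deployment.0").isPresent());
    }
}
```

Rejecting a short line up front makes a missing checkpoint visible immediately instead of leaving the caller waiting for a reply.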
>> Best regards
>> Stefan
>>
>>
>>
>>
>>
>>
>> Output of java -Ddebug=1 OneVmmVmware --username oneadmin --password xxxx
>> --ignorecert:
>> DEPLOY 1 fqdn /usr/share/one/var/1/deployment.0 CP1
>> DEPLOY FAILURE 1 Failed deploying VM in host fqdn.
>> [29.01.2010 01:52:17] Failed deploying VM 1 into fqdn.Reason: null
>> ---- Debug stack trace ----
>> AxisFault
>> faultCode: ServerFaultCode
>> faultSubcode:
>> faultString: The attempted operation cannot be performed in the current
>> state (Powered On).
>> faultActor:
>> faultNode:
>> faultDetail:
>> {urn:vim2}InvalidPowerStateFault:<requestedState>poweredOn</requestedState><existingState>poweredOn</existingState>
>> The attempted operation cannot be performed in the current state (Powered
>> On).
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
>> at java.lang.Class.newInstance0(Class.java:372)
>> at java.lang.Class.newInstance(Class.java:325)
>> at
>> org.apache.axis.encoding.ser.BeanDeserializer.<init>(BeanDeserializer.java:104)
>> at
>> org.apache.axis.encoding.ser.BeanDeserializer.<init>(BeanDeserializer.java:90)
>> at
>> com.vmware.vim.InvalidPowerState.getDeserializer(InvalidPowerState.java:156)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:616)
>> ......
>> at org.apache.axis.client.AxisClient.invoke(AxisClient.java:206)
>> at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
>> at org.apache.axis.client.Call.invoke(Call.java:2767)
>> at org.apache.axis.client.Call.invoke(Call.java:2443)
>> at org.apache.axis.client.Call.invoke(Call.java:2366)
>> at org.apache.axis.client.Call.invoke(Call.java:1812)
>> at com.vmware.vim.VimBindingStub.powerOnVM_Task(VimBindingStub.java:24320)
>> at OperationsOverVM.powerOn(OperationsOverVM.java:82)
>> at OneVmmVmware.loop(OneVmmVmware.java:204)
>> at OneVmmVmware.main(OneVmmVmware.java:57)
>> [29.01.2010 01:52:17] ---------------------------
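The `faultDetail` in the trace above carries both the requested and the existing power state, so a caller can tell "already powered on" apart from other failures. The following is a minimal sketch that works on the fault text exactly as printed in this trace; the `PowerStateFault` class is hypothetical, and whether the driver should treat this case as a success is a design decision the thread does not settle.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that inspects the printed fault detail text.
final class PowerStateFault {
    private static final Pattern STATES = Pattern.compile(
        "<requestedState>(\\w+)</requestedState>\\s*"
        + "<existingState>(\\w+)</existingState>");

    // True when the fault says the VM was already in the requested state.
    // False for any text that does not contain both elements.
    static boolean alreadyInRequestedState(String faultDetail) {
        Matcher m = STATES.matcher(faultDetail);
        return m.find() && m.group(1).equals(m.group(2));
    }

    public static void main(String[] args) {
        String detail = "{urn:vim2}InvalidPowerStateFault:"
            + "<requestedState>poweredOn</requestedState>"
            + "<existingState>poweredOn</existingState>";
        System.out.println(alreadyInRequestedState(detail));
    }
}
```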
>>
>>
>>
>>
>>
>> Output of java -Ddebug=1 OneVmmVmware .... based on network misconfiguration?
>> DEPLOY FAILURE 0 Failed deploying VM in host fqdn.
>> [29.01.2010 01:10:43] Failed deploying VM 0 into fqdn.Reason: null
>> ---- Debug stack trace ----
>> java.lang.NullPointerException
>> at DeployVM.configureNetwork(DeployVM.java:268)
>> at DeployVM.shapeVM(DeployVM.java:220)
>> at OneVmmVmware.loop(OneVmmVmware.java:168)
>> at OneVmmVmware.main(OneVmmVmware.java:57)
>> [29.01.2010 01:10:43] ---------------------------
>>
>>
>>
>> Old network config:
>> NAME = "VMWareNet"
>> TYPE = RANGED
>> BRIDGE = NAT
>> NETWORK_ADDRESS = 192.168.189.200
>> NETWORK_SIZE = 254
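A network template like the one above can be read into a map and checked for missing keys before its values are used. This is a minimal sketch; the `NetworkTemplate` class is hypothetical, the set of keys to require is left to the caller, and the thread does not establish that a missing value is what caused the NullPointerException in `configureNetwork`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reader for KEY = VALUE template files.
final class NetworkTemplate {
    // Parses KEY = VALUE lines; surrounding double quotes on values are dropped.
    static Map<String, String> parse(String text) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // not a KEY = VALUE line
            }
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            if (!key.isEmpty()) {
                out.put(key, value);
            }
        }
        return out;
    }

    // Lists the required keys that are absent from the parsed template.
    static List<String> missingKeys(Map<String, String> template, String... required) {
        List<String> missing = new ArrayList<>();
        for (String key : required) {
            if (!template.containsKey(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> t = parse(
            "NAME = \"VMWareNet\"\nTYPE = RANGED\nBRIDGE = NAT\n"
            + "NETWORK_ADDRESS = 192.168.189.200\nNETWORK_SIZE = 254\n");
        System.out.println(t);
        System.out.println(missingKeys(t, "NAME", "BRIDGE"));
    }
}
```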
>>
>>
>>
>> Oned.log (extract)
>> Fri Jan 29 00:55:28 2010 [TM][D]: Message received: TRANSFER SUCCESS 0 -
>> Fri Jan 29 00:55:28 2010 [LCM][I]: prolog success:
>> Fri Jan 29 00:55:39 2010 [VMM][I]: Recovering VMM drivers
>> Fri Jan 29 00:56:03 2010 [ReM][D]: VirtualMachinePoolInfo method invoked
>> Fri Jan 29 00:56:06 2010 [ReM][D]: VirtualMachinePoolInfo method invoked
>> Fri Jan 29 00:56:51 2010 [VMM][I]: Monitoring VM 86.
>> Fri Jan 29 00:56:54 2010 [InM][I]: Monitoring host fqdn (0)
>> Fri Jan 29 00:56:57 2010 [InM][D]: Host 0 successfully monitored.
>> Fri Jan 29 00:56:57 2010 [ReM][D]: VirtualMachinePoolInfo method invoked
>> Fri Jan 29 00:57:01 2010 [VMM][D]: Message received: POLL SUCCESS 0 STATE=a
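Whether oned ever received a given driver reply can be checked by scanning the log for a matching "Message received" line. The following is a minimal sketch; the `OnedLogScan` class is hypothetical, and the assumption that a deploy reply would be logged in the same shape as the `POLL SUCCESS 0` line above is mine, not confirmed by the extract.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical log scanner for "Message received: <ACTION> <RESULT> <ID>" lines.
final class OnedLogScan {
    static boolean sawDriverMessage(List<String> logLines, String action,
                                    String result, int vmId) {
        // The ID must be followed by whitespace or end of line,
        // so that ID 7 does not match a line about ID 77.
        Pattern p = Pattern.compile("Message received: " + Pattern.quote(action)
            + " " + Pattern.quote(result) + " " + vmId + "(\\s|$)");
        for (String line : logLines) {
            if (p.matcher(line).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> log = List.of(
            "Fri Jan 29 00:55:28 2010 [TM][D]: Message received: TRANSFER SUCCESS 0 -",
            "Fri Jan 29 00:57:01 2010 [VMM][D]: Message received: POLL SUCCESS 0 STATE=a");
        System.out.println(sawDriverMessage(log, "POLL", "SUCCESS", 0));
        System.out.println(sawDriverMessage(log, "DEPLOY", "SUCCESS", 0));
    }
}
```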
>>
>> On 28.01.2010 at 19:37, Tino Vazquez wrote:
>>
>> Hi Stefan,
>>
>> Let's try executing the driver by hand. The VMM driver talks with
>> OpenNebula core using an ASCII protocol. So, if you execute the
>> driver:
>>
>> $ONE_LOCATION/lib/mads/one_vmm_vmware
>>
>> and hit enter, it should wait for input in the standard input, and you
>> will need to type:
>>
>> ---8<----
>> DEPLOY 0 fqdn var/77/images/deployment.0
>> --->8----
>>
>> assuming that $ONE_LOCATION/var/77 exists (i.e. a previous attempt to
>> run a VM with OpenNebula ID 77 has been made; it doesn't have to have
>> succeeded).
>>
>> The answer we are waiting for is a DEPLOY SUCCESS 0; this is what the
>> OpenNebula core seems not to be getting.
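The manual exchange described above (write one request line to the driver's standard input, then wait for its reply on standard output) can be sketched in code. This is a minimal sketch using in-memory streams in place of a real driver process; the `DriverExchange` class is hypothetical, and the reply format `<ACTION> SUCCESS|FAILURE <ID>` is inferred from the `DEPLOY FAILURE 1 ...` output elsewhere in this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical harness for one request/reply exchange with a driver.
final class DriverExchange {
    // Sends one request line and returns the first reply line that starts with
    // "<ACTION> SUCCESS" or "<ACTION> FAILURE". Other lines (debug output) are
    // skipped. Returns null if the stream ends without such a line.
    static String request(Writer toDriver, BufferedReader fromDriver, String requestLine) {
        String action = requestLine.trim().split("\\s+")[0];
        try {
            toDriver.write(requestLine);
            toDriver.write('\n');
            toDriver.flush(); // make sure the request actually leaves our buffer
            String line;
            while ((line = fromDriver.readLine()) != null) {
                if (line.startsWith(action + " SUCCESS")
                        || line.startsWith(action + " FAILURE")) {
                    return line;
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In-memory stand-ins for the driver's stdin and stdout.
        StringWriter stdin = new StringWriter();
        BufferedReader stdout = new BufferedReader(
            new StringReader("some debug output\nDEPLOY SUCCESS 0\n"));
        System.out.println(
            request(stdin, stdout, "DEPLOY 0 fqdn var/77/images/deployment.0"));
    }
}
```

Against a real driver the two streams would come from a `Process` started with `ProcessBuilder`. Note that `readLine` blocks for as long as the driver stays silent, so a real harness would also need a timeout.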
>>
>> Your use case is very interesting, we are happy to help. OpenNebula
>> doesn't feature a web GUI per se, but we offer a REST interface using
>> EC2 or OCCI, on top of which an AJAX application can be built.
>>
>> Best regards,
>>
>> -Tino
>>
>>
>>
>>
>> On Thu, Jan 28, 2010 at 1:05 AM, Stefan Reichel
>> <Stefan.Reichel at student.hpi.uni-potsdam.de> wrote:
>>
>> Hi Tino,
>>
>> I just wanted to write that everything is fine, but it isn't. The problem
>> only occurs sometimes. At the end of this mail you will find some logs.
>> VM(77) is up and running in the VM-Server, but at the same time it is in
>> the "boot" state. By the way, we currently use VMware-server-2.0.2-203138.i386.
>> As you can see in the logs, the "DEPLOY SUCCESS" message is sent, or at
>> least OneVmmVmware.java believes so. But it seems that it is never received.
>> Sometimes it works, but this is not deterministic; in that case the message
>> also appears in oned.log.
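Nothing in this thread establishes why a message that the driver believes it sent would not arrive. One general way a written line can fail to reach its reader is output buffering, so it may be worth ruling out. The following is a minimal sketch of that effect with plain JDK streams; the `FlushDemo` class is hypothetical and says nothing about how OneVmmVmware.java actually writes its replies.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical demonstration of a reply line held back by a buffer.
final class FlushDemo {
    // Writes one reply line through a buffered, non-autoflushing stream and
    // reports how many bytes reached the sink before and after flush().
    static int[] bytesBeforeAndAfterFlush(String reply) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(new BufferedOutputStream(sink), false);
        out.println(reply);
        int before = sink.size(); // still sitting in the buffer
        out.flush();
        int after = sink.size();  // now delivered to the sink
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = bytesBeforeAndAfterFlush("DEPLOY SUCCESS 77");
        System.out.println("before flush: " + counts[0]);
        System.out.println("after flush:  " + counts[1]);
    }
}
```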
>>
>>
>> The main goal of our project is to set up a network environment in which
>> students can test and investigate several security weaknesses on live
>> systems. For every such scenario we need different computers, which are
>> simulated via VMs. In effect, when a new scenario is loaded, OpenNebula will
>> be responsible for the VM setup and management.
>> We will primarily use VMware images (with VM-Server) but also KVM. To control
>> our scenarios we will implement a web interface, which will be used for
>> management but also for monitoring purposes. As far as I know, OpenNebula has
>> only a command line frontend?
>>
>> Best Regards
>>
>> Stefan
>>
>>
>> VM.LOG:
>> Wed Jan 27 23:57:41 2010 [DiM][I]: New VM state is ACTIVE.
>> Wed Jan 27 23:57:42 2010 [LCM][I]: New VM state is PROLOG.
>> Wed Jan 27 23:57:42 2010 [VM][I]: Virtual Machine has no context
>> Wed Jan 27 23:58:46 2010 [TM][I]: tm_clone.sh: fqdn:/srv/seclab/images-src/vmware/XP2 fqdn:/srv/seclab/vms/77/images/disk.0
>> Wed Jan 27 23:58:46 2010 [TM][I]: tm_clone.sh: Cloning fqdn:/srv/seclab/images-src/vmware/XP2
>> Wed Jan 27 23:58:46 2010 [LCM][I]: New VM state is BOOT
>> Wed Jan 27 23:58:46 2010 [VMM][I]: Generating deployment file: /usr/share/one/var/77/deployment.0
>>
>> ONED.LOG:
>> Wed Jan 27 23:57:41 2010 [DiM][D]: Deploying VM 77
>> Wed Jan 27 23:57:44 2010 [InM][I]: Monitoring host fqdn (0)
>> Wed Jan 27 23:57:48 2010 [InM][D]: Host 0 successfully monitored.
>> Wed Jan 27 23:57:52 2010 [VMM][I]: Monitoring VM 76.
>> Wed Jan 27 23:57:53 2010 [VMM][D]: Message received: POLL SUCCESS 76 STATE=a USEDMEMORY=25 USEDCPU=0
>> Wed Jan 27 23:58:46 2010 [TM][D]: Message received: LOG - 77 tm_clone.sh: fqdn:/srv/seclab/images-src/vmware/XP2 fqdn:/srv/seclab/vms/77/images/disk.0
>> Wed Jan 27 23:58:46 2010 [TM][D]: Message received: LOG - 77 tm_clone.sh: Cloning fqdn:/srv/seclab/images-src/vmware/XP2
>> Wed Jan 27 23:58:46 2010 [TM][D]: Message received: TRANSFER SUCCESS 77 -
>> Wed Jan 27 23:58:46 2010 [LCM][I]: prolog success:
>>
>> VMM_VMWARE.LOG (output added to OneVmmVmware.java):
>> [27.01.2010 23:41:40] TRY TO POWER ON
>> [27.01.2010 23:41:45] DEPLOY SUCCESS
>> [27.01.2010 23:58:50] TRY TO POWER ON
>> [27.01.2010 23:58:54] DEPLOY SUCCESS
>>
>> On 27.01.2010 at 13:04, Tino Vazquez wrote:
>>
>> Hi Stefan,
>>
>> comments inline,
>>
>> On Wed, Jan 27, 2010 at 10:27 AM, Stefan Reichel
>>
>> <stefan.reichel at student.hpi.uni-potsdam.de> wrote:
>>
>>> Hi,
>>> I tried to analyze the bug and finally solve this problem. For now these are
>>> my results; please correct me if I am wrong.
>>> First of all, the VM is running in VMware, but in "onevm list" it is
>>> still booting. The VirtualMachineManager::deploy_action has finished. Those
>>> were the facts; now my theory:
>>> Normally the MadManager will receive a message and forward it to the
>>> corresponding VM driver by calling the protocol method. In my case this
>>> would be the VirtualMachineManagerDriver. Nevertheless, its protocol method
>>> is not called, and therefore it can't call the LifeCycleManager, which would
>>> in effect set the "running" state after reacting to a "DEPLOY_SUCCESS".
>>> Therefore I assume that the corresponding message is never sent. But who
>>> should send it???
>>
>> The VMware VMM MAD is responsible for sending back the DEPLOY SUCCESS, so
>> it is probably failing to do so. You mentioned in a previous email
>> that the VM is already running, so I guess the driver is crashing
>> badly after performing the powerOn (otherwise it would send the "DEPLOY
>> FAILURE" and you would get a "fail" instead of a "boot" in the VM
>> state). Do you see anything in the one_vmm_vmware log file?
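The reasoning above (a success reply moves the VM to "running", a failure reply moves it to "fail", and no reply at all leaves it in "boot") can be written down as a small state function. This is a minimal sketch, not OpenNebula's implementation; the `DeployReplyHandler` class and its `State` enum are hypothetical, and the reply format `DEPLOY SUCCESS|FAILURE <ID>` is inferred from the driver output shown elsewhere in this thread.

```java
// Hypothetical model of how a deploy reply changes the VM state.
final class DeployReplyHandler {
    enum State { BOOT, RUNNING, FAILED }

    // Maps a driver reply to the next state of the given VM.
    // Any line that is not a DEPLOY reply for this VM leaves the state
    // unchanged, which is the "stuck in boot" case.
    static State next(State current, int vmId, String replyLine) {
        String[] t = replyLine.trim().split("\\s+");
        if (t.length < 3 || !t[0].equals("DEPLOY")
                || !t[2].equals(Integer.toString(vmId))) {
            return current;
        }
        if (t[1].equals("SUCCESS")) {
            return State.RUNNING;
        }
        if (t[1].equals("FAILURE")) {
            return State.FAILED;
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(next(State.BOOT, 77, "DEPLOY SUCCESS 77"));
        System.out.println(next(State.BOOT, 77,
            "DEPLOY FAILURE 77 Failed deploying VM in host fqdn."));
        System.out.println(next(State.BOOT, 77, "POLL SUCCESS 76 STATE=a"));
    }
}
```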
>>
>>
>>> I temporarily fixed the problem by setting the running state manually in the
>>> mentioned deploy_action. I hope that someone will finally answer one of my
>>> messages. Indeed it's my third unanswered(?) message to this community.
>>
>> Sorry for the delay in my answers; please take into account that this
>> is a best-effort support mailing list.
>>
>>> We are trying to use OpenNebula in our current project, but the focus of our
>>> project is not to get software to do what it is supposed to do. Nevertheless,
>>> we are software developers, and therefore we could also fix and extend the
>>> OpenNebula project if there were any support from you.
>>
>> That is great news!! Could you please elaborate a bit on what it is that
>> you intend to do with the software? We are happy to provide best-effort
>> support.
>>
>> Best regards,
>>
>> -Tino
>>
>>
>> Kind regards,
>>
>> Stefan
>>
>>
>>
>>
>> On 26.01.2010 at 21:41, Stefan Reichel wrote:
>>
>> Hi OpenNebula team,
>> our developer team tried to use OpenNebula and we are now able to start VMs
>> in VMware. But we have a serious problem: every VM we start stays in the
>> "boot" state. What is the reason for that, and where can we gather more
>> information about the problem? We use OpenNebula 1.4 / SVN version on Ubuntu
>> 9.10 in combination with VMware Server 2. Any help would be appreciated.
>> Sincerely
>> Stefan
>>
>>
>>
>>
>> _______________________________________________
>>
>> Users mailing list
>>
>> Users at lists.opennebula.org
>>
>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>
>>
>>
>>
>>
>> _____________________________________
>>
>> Stefan Reichel, M.Sc. Candidate
>>
>> Hasso-Plattner-Institut für Softwaresystemtechnik GmbH
>>
>> Postfach 900460, D-14440 Potsdam, Germany
>>
>> http://www.hpi.uni-potsdam.de
>>
>> Telefon: 03322/206306 Mobile: 0178/5495023
>>
>> Email: stefan.reichel at student.hpi.uni-potsdam.de
>>
>> _____________________________________
>>
>>