[one-users] Error - MAD did not answer INIT command
Carlos Martín Sánchez
cmartin at opennebula.org
Tue Sep 4 03:13:13 PDT 2012
Hi,
As you said, the common cause for the "MAD did not answer INIT command" is
that the timeout is reached because of high load... I can't remember any
other obvious reason.
If you find the same problem again, you could try to execute the following:
$ . /usr/lib/one/mads/one_im_ssh -r 0 -t 15 kvm
INIT
It should return immediately:
INIT SUCCESS - -
Regards
--
Carlos Martín, MSc
Project Engineer
OpenNebula - The Open-source Solution for Data Center Virtualization
www.OpenNebula.org | cmartin at opennebula.org |
@OpenNebula<http://twitter.com/opennebula><cmartin at opennebula.org>
On Wed, Aug 29, 2012 at 10:55 AM, Carlos Jiménez <
cjimenez at eneotecnologia.com> wrote:
> Hello,
>
> The problem seems to be solved. I've just restarted one and
> sunstone-server and those errors have disappeared. Yesterday, I restarted
> both daemons but it didn't solve the problem. However, today it has been
> enough. I suspect this action has not been the reason of the solution. So,
> in order to prevent this errors again, could you give me any clue to locate
> the source of the problem?
> From other messages from people having similar errors, I've read that it
> could be done to the use of a VM as a frontend or because computer was
> having a high load. In this case, all are physical computers and running
> smoothly.
>
> I've configured ONE_MAD_DEBUG = 1 (in /etc/one/defaultrc) in order to
> enable drivers debugging information. Which is the associated log file? Is
> it one of these?
> one_auth_mad.log
> one_datastore.log
> one_hm.log
> one_im_ssh_one_im_exec.log
> one_tm.log
> one_vmm_exec.log
>
>
> Thank you,
>
> Carlos.
>
>
>
> On 08/28/2012 03:22 PM, Carlos Jiménez wrote:
>
> Hello everybody,
>
> I have 4 physical computers: 1 as a FrontEnd with CentOS 6.2 and
> OpenNebula 3.6, two hosts with CentOS 6.2 and qemu-kvm and another as an
> nfs server with a debian based distro.
> I created a VM and I was dealing with it when suddenly, it appeared
> several errors in sunstone web interface continuously.
> After restarting one and sunstone-server daemons, it was no possible to
> open Sunstone web interface so I configured logs in debug mode and checked
> oned.log and sunstone.log and the error "MAD did not answer INIT command"
> repeats often along the log.
>
> I guess the problem might be due to a failure loading vmm_kvm and/or
> im_kvm, so is it possible to manually load those drivers for checking it?
> I've checked that all the computers are performing smoothly (no bottleneck
> on cpu, disk, ram...).
>
>
> Please, could you help me on troubleshooting it?
>
> Thanks is advance.
>
> Carlos.
>
> oned.log:
>
> Tue Aug 28 15:03:37 2012 [ONE][I]: Log level:3
> [0=ERROR,1=WARNING,2=INFO,3=DEBUG]
> Tue Aug 28 15:03:37 2012 [ONE][I]: Checking database version.
> Tue Aug 28 15:03:37 2012 [VMM][I]: Starting Virtual Machine Manager...
> Tue Aug 28 15:03:37 2012 [VMM][I]: Virtual Machine Manager started.
> Tue Aug 28 15:03:37 2012 [LCM][I]: Starting Life-cycle Manager...
> Tue Aug 28 15:03:37 2012 [LCM][I]: Life-cycle Manager started.
> Tue Aug 28 15:03:37 2012 [InM][I]: Starting Information Manager...
> Tue Aug 28 15:03:37 2012 [InM][I]: Information Manager started.
> Tue Aug 28 15:03:37 2012 [TrM][I]: Starting Transfer Manager...
> Tue Aug 28 15:03:37 2012 [TrM][I]: Transfer Manager started.
> Tue Aug 28 15:03:37 2012 [DiM][I]: Starting Dispatch Manager...
> Tue Aug 28 15:03:37 2012 [DiM][I]: Dispatch Manager started.
> Tue Aug 28 15:03:37 2012 [HKM][I]: Starting Hook Manager...
> Tue Aug 28 15:03:37 2012 [HKM][I]: Hook Manager started.
> Tue Aug 28 15:03:37 2012 [AuM][I]: Starting Auth Manager...
> Tue Aug 28 15:03:37 2012 [AuM][I]: Authorization Manager started.
> Tue Aug 28 15:03:37 2012 [ImM][I]: Starting Image Manager...
> Tue Aug 28 15:03:37 2012 [ImM][I]: Image Manager started.
> Tue Aug 28 15:03:37 2012 [ReM][I]: Starting Request Manager...
> Tue Aug 28 15:03:37 2012 [ReM][I]: Request Manager started.
> Tue Aug 28 15:03:37 2012 [ReM][I]: Starting XML-RPC server, port 2633 ...
> Tue Aug 28 15:03:39 2012 [VMM][I]: Loading Virtual Machine Manager drivers.
> Tue Aug 28 15:03:39 2012 [VMM][I]: Loading driver: vmm_kvm (KVM)
> Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT command
> Tue Aug 28 15:03:39 2012 [InM][I]: Loading Information Manager drivers.
> Tue Aug 28 15:03:39 2012 [InM][I]: Loading driver: im_kvm
> Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT command
> Tue Aug 28 15:03:39 2012 [TM][I]: Loading Transfer Manager driver.
> Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT command
> Tue Aug 28 15:03:39 2012 [HKM][I]: Loading Hook Manager driver.
> Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT command
> Tue Aug 28 15:03:39 2012 [ImM][I]: Loading Image Manager driver.
> Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT command
> Tue Aug 28 15:03:39 2012 [AuM][I]: Loading Auth. Manager driver.
> Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT command
> Tue Aug 28 15:03:43 2012 [ReM][D]: UserPoolInfo method invoked
> Tue Aug 28 15:03:43 2012 [AuM][E]: Auth Error: Could not find
> Authorization driver
> Tue Aug 28 15:03:43 2012 [ReM][E]: [UserPoolInfo] User couldn't be
> authenticated, aborting call.
> Tue Aug 28 15:03:54 2012 [ONE][E]: SQL command was: DELETE FROM
> vm_monitoring WHERE last_poll < 1346072632, error: database is locked
> Tue Aug 28 15:03:57 2012 [ONE][E]: SQL command was: DELETE FROM
> host_monitoring WHERE last_mon_time < 1346072632, error: database is locked
> Tue Aug 28 15:03:57 2012 [InM][I]: Monitoring host Host1 (3)
> Tue Aug 28 15:03:57 2012 [InM][E]: Could not find information driver im_kvm
> Tue Aug 28 15:03:57 2012 [VMM][I]: Monitoring VM 31.
> Tue Aug 28 15:03:59 2012 [ONE][E]: SQL command was: REPLACE INTO host_pool
> (oid, name, body, state, last_mon_time, uid, gid, owner_u, group_u,
> other_u) VALUES
> (3,'Host1','<HOST><ID>3</ID><NAME>Host1</NAME><STATE>3</STATE><IM_MAD>im_kvm</IM_MAD><VM_MAD>vmm_kvm</VM_MAD><VN_MAD>dummy</VN_MAD><LAST_MON_TIME>1346151563</LAST_MON_TIME><CLUSTER_ID>100</CLUSTER_ID><CLUSTER>Cluster1</CLUSTER><HOST_SHARE><DISK_USAGE>0</DISK_USAGE><MEM_USAGE>0</MEM_USAGE><CPU_USAGE>0</CPU_USAGE><MAX_DISK>0</MAX_DISK><MAX_MEM>3922016</MAX_MEM><MAX_CPU>800</MAX_CPU><FREE_DISK>0</FREE_DISK><FREE_MEM>3732716</FREE_MEM><FREE_CPU>800</FREE_CPU><USED_DISK>0</USED_DISK><USED_MEM>189300</USED_MEM><USED_CPU>0</USED_CPU><RUNNING_VMS>0</RUNNING_VMS></HOST_SHARE><TEMPLATE><ARCH><![CDATA[x86_64]]></ARCH><CPUSPEED><![CDATA[1861]]></CPUSPEED><F
> REECPU> ;&
> lt;![CDATA[800.0]]></FREECPU><FREEMEMORY><![CDATA[3732716]]></FREEMEMORY><HOSTNAME><![CDATA[host1]]></HOSTNAME><HYPERVISOR><![CDATA[kvm]]></HYPERVISOR><MODELNAME><![CDATA[Intel(R)
> Xeon(R) CPU E5320 @
> 1.86GHz]]></MODELNAME><NETRX><![CDATA[926024]]></NETRX><NETTX><![CDATA[0]]></NETTX><TOTALCPU><![CDATA[800]]></TOTALCPU><TOTALMEMORY><![CDATA[3922016]]></TOTALMEMORY><USEDCPU><![CDATA[0.0]]></USEDCPU><USEDMEMORY><![CDATA[189300]]></USEDMEMORY></TEMPLATE></HOST>',3,1346151563,0,0,1,0,0),
> error: database is locked
>
> Tue Aug 28 15:03:59 2012 [InM][I]: Monitoring host Host2 (4)
> Tue Aug 28 15:03:59 2012 [InM][E]: Could not find information driver im_kvm
>
>
>
> Sunstone-server log:
>
> [oneadmin at frontend etc]$ cat /var/log/one/sunstone.log
> --------------------------------------
> Server configuration
> --------------------------------------
> {:vnc_proxy_cert=>nil,
> :tmpdir=>"/var/tmp/one",
> :vnc_proxy_path=>"/usr/share/one/noVNC/utils/websockify",
> :marketplace_url=>"https://marketplace.c12g.com/appliance"<https://marketplace.c12g.com/appliance>
> ,
> :vnc_proxy_key=>nil,
> :lang=>"en_US",
> :auth=>"sunstone",
> :vnc_proxy_base_port=>29876,
> :vnc_proxy_support_wss=>false,
> :debug_level=>3,
> :host=>"0.0.0.0",
> :one_xmlrpc=>"http://localhost:2633/RPC2" <http://localhost:2633/RPC2>,
> :core_auth=>"cipher",
> :port=>9869}
> Tue Aug 28 15:03:43 2012 [E]: Error initializing authentication system
> Tue Aug 28 15:03:43 2012 [E]: [UserPoolInfo] User couldn't be
> authenticated, aborting call.
>
>
>
> _______________________________________________
> Users mailing list
> Users at lists.opennebula.org
> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.opennebula.org/pipermail/users-opennebula.org/attachments/20120904/59da8cec/attachment-0001.htm>
More information about the Users
mailing list