Hi,<div><br></div>As you said, the common cause for the "MAD did not answer INIT command" is that the timeout is reached because of high load... I can't remember any other obvious reason.<div><br></div><div><br>
</div><div>If you find the same problem again, you could try to execute the following:</div><div><br></div><div><div>$ . /usr/lib/one/mads/one_im_ssh -r 0 -t 15 kvm</div><div>INIT</div><div><div><br></div><div>It should return immediately:</div>
<div>INIT SUCCESS - -</div><div><br></div><div><br></div><div>Regards</div><div><div><div>--<br>Carlos Martín, MSc<br>Project Engineer<br>OpenNebula - The Open-source Solution for Data Center Virtualization<div><span style="border-collapse:collapse;color:rgb(136,136,136);font-family:arial,sans-serif;font-size:13px"><a href="http://www.OpenNebula.org" target="_blank">www.OpenNebula.org</a> | <a href="mailto:cmartin@opennebula.org" target="_blank">cmartin@opennebula.org</a> | <a href="http://twitter.com/opennebula" target="_blank">@OpenNebula</a></span><span style="border-collapse:collapse;color:rgb(136,136,136);font-family:arial,sans-serif;font-size:13px"><a href="mailto:cmartin@opennebula.org" style="color:rgb(42,93,176)" target="_blank"></a></span></div>
<br>
<br><br><div class="gmail_quote">On Wed, Aug 29, 2012 at 10:55 AM, Carlos Jiménez <span dir="ltr"><<a href="mailto:cjimenez@eneotecnologia.com" target="_blank">cjimenez@eneotecnologia.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Hello,<br>
<br>
The problem seems to be solved. I've just restarted one and
sunstone-server and those errors have disappeared. Yesterday, I
restarted both daemons but it didn't solve the problem. However,
today it has been enough. I suspect this action has not been the
reason of the solution. So, in order to prevent this errors again,
could you give me any clue to locate the source of the problem?<br>
From other messages from people having similar errors, I've read
that it could be done to the use of a VM as a frontend or because
computer was having a high load. In this case, all are physical
computers and running smoothly.<br>
<br>
I've configured ONE_MAD_DEBUG = 1 (in /etc/one/defaultrc) in order
to enable drivers debugging information. Which is the associated log
file? Is it one of these?<br>
one_auth_mad.log<br>
one_datastore.log<br>
one_hm.log<br>
one_im_ssh_one_im_exec.log<br>
one_tm.log<br>
one_vmm_exec.log<br>
<br>
<br>
Thank you,<br>
<br>
Carlos.<div><div class="h5"><br>
<br>
<br>
<div>On 08/28/2012 03:22 PM, Carlos Jiménez
wrote:<br>
</div>
</div></div><blockquote type="cite"><div><div class="h5">
Hello everybody,<br>
<br>
I have 4 physical computers: 1 as a FrontEnd with CentOS 6.2 and
OpenNebula 3.6, two hosts with CentOS 6.2 and qemu-kvm and another
as an nfs server with a debian based distro.<br>
I created a VM and I was dealing with it when suddenly, it
appeared several errors in sunstone web interface continuously.<br>
After restarting one and sunstone-server daemons, it was no
possible to open Sunstone web interface so I configured logs in
debug mode and checked oned.log and sunstone.log and the error
"MAD did not answer INIT command" repeats often along the log.<br>
<br>
I guess the problem might be due to a failure loading vmm_kvm
and/or im_kvm, so is it possible to manually load those drivers
for checking it?<br>
I've checked that all the computers are performing smoothly (no
bottleneck on cpu, disk, ram...).<br>
<br>
<br>
Please, could you help me on troubleshooting it?<br>
<br>
Thanks is advance.<br>
<br>
Carlos.<br>
<br>
oned.log:<br>
<br>
</div></div><small><font face="Helvetica, Arial, sans-serif"><div><div class="h5">Tue Aug 28
15:03:37 2012 [ONE][I]: Log level:3
[0=ERROR,1=WARNING,2=INFO,3=DEBUG]<br>
Tue Aug 28 15:03:37 2012 [ONE][I]: Checking database version.<br>
Tue Aug 28 15:03:37 2012 [VMM][I]: Starting Virtual Machine
Manager...<br>
Tue Aug 28 15:03:37 2012 [VMM][I]: Virtual Machine Manager
started.<br>
Tue Aug 28 15:03:37 2012 [LCM][I]: Starting Life-cycle
Manager...<br>
Tue Aug 28 15:03:37 2012 [LCM][I]: Life-cycle Manager started.<br>
Tue Aug 28 15:03:37 2012 [InM][I]: Starting Information
Manager...<br>
Tue Aug 28 15:03:37 2012 [InM][I]: Information Manager
started.<br>
Tue Aug 28 15:03:37 2012 [TrM][I]: Starting Transfer
Manager...<br>
Tue Aug 28 15:03:37 2012 [TrM][I]: Transfer Manager started.<br>
Tue Aug 28 15:03:37 2012 [DiM][I]: Starting Dispatch
Manager...<br>
Tue Aug 28 15:03:37 2012 [DiM][I]: Dispatch Manager started.<br>
Tue Aug 28 15:03:37 2012 [HKM][I]: Starting Hook Manager...<br>
Tue Aug 28 15:03:37 2012 [HKM][I]: Hook Manager started.<br>
Tue Aug 28 15:03:37 2012 [AuM][I]: Starting Auth Manager...<br>
Tue Aug 28 15:03:37 2012 [AuM][I]: Authorization Manager
started.<br>
Tue Aug 28 15:03:37 2012 [ImM][I]: Starting Image Manager...<br>
Tue Aug 28 15:03:37 2012 [ImM][I]: Image Manager started.<br>
Tue Aug 28 15:03:37 2012 [ReM][I]: Starting Request Manager...<br>
Tue Aug 28 15:03:37 2012 [ReM][I]: Request Manager started.<br>
Tue Aug 28 15:03:37 2012 [ReM][I]: Starting XML-RPC server,
port 2633 ...<br>
Tue Aug 28 15:03:39 2012 [VMM][I]: Loading Virtual Machine
Manager drivers.<br>
Tue Aug 28 15:03:39 2012 [VMM][I]: Loading driver: vmm_kvm
(KVM)<br>
Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT
command<br>
Tue Aug 28 15:03:39 2012 [InM][I]: Loading Information Manager
drivers.<br>
Tue Aug 28 15:03:39 2012 [InM][I]: Loading driver: im_kvm<br>
Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT
command<br>
Tue Aug 28 15:03:39 2012 [TM][I]: Loading Transfer Manager
driver.<br>
Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT
command<br>
Tue Aug 28 15:03:39 2012 [HKM][I]: Loading Hook Manager
driver.<br>
Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT
command<br>
Tue Aug 28 15:03:39 2012 [ImM][I]: Loading Image Manager
driver.<br>
Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT
command<br>
Tue Aug 28 15:03:39 2012 [AuM][I]: Loading Auth. Manager
driver.<br>
Tue Aug 28 15:03:39 2012 [MAD][E]: MAD did not answer INIT
command<br>
Tue Aug 28 15:03:43 2012 [ReM][D]: UserPoolInfo method invoked<br>
Tue Aug 28 15:03:43 2012 [AuM][E]: Auth Error: Could not find
Authorization driver<br>
Tue Aug 28 15:03:43 2012 [ReM][E]: [UserPoolInfo] User
couldn't be authenticated, aborting call.<br>
Tue Aug 28 15:03:54 2012 [ONE][E]: SQL command was: DELETE
FROM vm_monitoring WHERE last_poll < 1346072632, error:
database is locked<br>
Tue Aug 28 15:03:57 2012 [ONE][E]: SQL command was: DELETE
FROM host_monitoring WHERE last_mon_time < 1346072632,
error: database is locked<br>
Tue Aug 28 15:03:57 2012 [InM][I]: Monitoring host Host1 (3)<br>
Tue Aug 28 15:03:57 2012 [InM][E]: Could not find information
driver im_kvm<br>
Tue Aug 28 15:03:57 2012 [VMM][I]: Monitoring VM 31.<br></div></div>
Tue Aug 28 15:03:59 2012 [ONE][E]: SQL command was: REPLACE
INTO host_pool (oid, name, body, state, last_mon_time, uid,
gid, owner_u, group_u, other_u) VALUES
(3,'Host1','<HOST><ID>3</ID><NAME>Host1</NAME><STATE>3</STATE><IM_MAD>im_kvm</IM_MAD><VM_MAD>vmm_kvm</VM_MAD><VN_MAD>dummy</VN_MAD><LAST_MON_TIME>1346151563</LAST_MON_TIME><CLUSTER_ID>100</CLUSTER_ID><CLUSTER>Cluster1</CLUSTER><HOST_SHARE><DISK_USAGE>0</DISK_USAGE><MEM_USAGE>0</MEM_USAGE><CPU_USAGE>0</CPU_USAGE><MAX_DISK>0</MAX_DISK><MAX_MEM>3922016</MAX_MEM><MAX_CPU>800</MAX_CPU><FREE_DISK>0</FREE_DISK><FREE_MEM>3732716</FREE_MEM><FREE_CPU>800</FREE_CPU><USED_DISK>0</USED_DISK><USED_MEM>189300</USED_MEM><USED_CPU>0</USED_CPU><RUNNING_VMS>0</RUNNING_VMS></HOST_SHARE><TEMPLATE><ARCH><![CDATA[x86_64]]></ARCH><CPUSPEED><![CDATA[1861]]></CPUSPEED><F
REECPU>
;&
lt;![CDATA[800.0]]></FREECPU><FREEMEMORY><![CDATA[3732716]]></FREEMEMORY><HOSTNAME><![CDATA[host1]]></HOSTNAME><HYPERVISOR><![CDATA[kvm]]></HYPERVISOR><MODELNAME><![CDATA[Intel(R)
Xeon(R) CPU E5320 @
1.86GHz]]></MODELNAME><NETRX><![CDATA[926024]]></NETRX><NETTX><![CDATA[0]]></NETTX><TOTALCPU><![CDATA[800]]></TOTALCPU><TOTALMEMORY><![CDATA[3922016]]></TOTALMEMORY><USEDCPU><![CDATA[0.0]]></USEDCPU><USEDMEMORY><![CDATA[189300]]></USEDMEMORY></TEMPLATE></HOST>',3,1346151563,0,0,1,0,0),
error: database is locked<div class="im"><br>
Tue Aug 28 15:03:59 2012 [InM][I]: Monitoring host Host2 (4)<br>
Tue Aug 28 15:03:59 2012 [InM][E]: Could not find information
driver im_kvm<br>
</div></font></small><div class="im"><br>
<br>
<br>
Sunstone-server log:<br>
<br>
<small><font face="Helvetica, Arial, sans-serif">[oneadmin@frontend
etc]$ cat /var/log/one/sunstone.log <br>
--------------------------------------<br>
Server configuration <br>
--------------------------------------<br>
{:vnc_proxy_cert=>nil,<br>
:tmpdir=>"/var/tmp/one",<br>
:vnc_proxy_path=>"/usr/share/one/noVNC/utils/websockify",<br>
:marketplace_url=><a href="https://marketplace.c12g.com/appliance" target="_blank">"https://marketplace.c12g.com/appliance"</a>,<br>
:vnc_proxy_key=>nil,<br>
:lang=>"en_US",<br>
:auth=>"sunstone",<br>
:vnc_proxy_base_port=>29876,<br>
:vnc_proxy_support_wss=>false,<br>
:debug_level=>3,<br>
:host=>"0.0.0.0",<br>
:one_xmlrpc=><a href="http://localhost:2633/RPC2" target="_blank">"http://localhost:2633/RPC2"</a>,<br>
:core_auth=>"cipher",<br>
:port=>9869}<br>
Tue Aug 28 15:03:43 2012 [E]: Error initializing
authentication system<br>
Tue Aug 28 15:03:43 2012 [E]: [UserPoolInfo] User couldn't be
authenticated, aborting call.<br>
</font></small><br>
</div></blockquote>
<br>
</div>
<br>_______________________________________________<br>
Users mailing list<br>
<a href="mailto:Users@lists.opennebula.org">Users@lists.opennebula.org</a><br>
<a href="http://lists.opennebula.org/listinfo.cgi/users-opennebula.org" target="_blank">http://lists.opennebula.org/listinfo.cgi/users-opennebula.org</a><br>
<br></blockquote></div><br></div></div></div></div></div>