[one-users] Host monitoring with snmp ?

Neil Mooney neil.mooney at sara.nl
Wed Apr 21 00:19:39 PDT 2010


Excellent, I will package up the SNMP information manager scripts and
post them to the ecosystem.

Regarding the monitoring interval, I am a little confused. Perhaps
there is a reason, but I can't fathom it: it seems that if you don't have
a multiple of 10 nodes, then your monitoring interval will grow ...
 

((2 nodes mod 10) + 1) * 10 = 30 second interval

((10 nodes mod 10) + 1) * 10 = 10 second interval

((26 nodes mod 10) + 1) * 10 = 70 second interval

((260 nodes mod 10) + 1) * 10 = 10 second interval

Why should it be quicker to monitor 10 or 260 hosts rather than 2 hosts?
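
To make the arithmetic easy to check, here is a minimal Python sketch. Only the first function follows the formula quoted from Ruben below. The function names, the `batch` parameter and the second function are my own assumptions for illustration: the second one supposes that each monitor action covers one batch of 10 hosts, so the cycle time grows with the number of batches. Nothing in this thread confirms that oned actually works that way.

```python
def interval_mod(num_hosts, monitoring_interval, batch=10):
    """Formula as quoted in this thread: ((num_hosts mod 10) + 1) * interval."""
    return ((num_hosts % batch) + 1) * monitoring_interval


def interval_batches(num_hosts, monitoring_interval, batch=10):
    """Hypothetical alternative (an assumption, not confirmed here):
    one batch of `batch` hosts per monitor action, so the cycle time is
    ceil(num_hosts / batch) * interval."""
    batches = -(-num_hosts // batch)  # ceiling division on integers
    return batches * monitoring_interval


if __name__ == "__main__":
    # Same host counts as in the examples above, with a 10 second interval.
    for n in (2, 10, 26, 260):
        print(n, "hosts:",
              "mod formula", interval_mod(n, 10), "s,",
              "batch assumption", interval_batches(n, 10), "s")
```

With the mod formula, 26 hosts come out slower than 260 hosts (70 s against 10 s), which is the oddity I am asking about. Under the batch assumption the cycle time only ever grows with the host count.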

Cheers

Neil


Ruben S. Montero wrote:
> Hi
>
> This is a great idea indeed, and a great contribution for the
> ecosystem ;) . Just a few thoughts about monitoring...
>
> The MONITORING_INTERVAL is the time between two monitor actions in
> OpenNebula. Each monitor action takes ONLY THE LAST 10 hosts
> monitored by the system. So if you have more than 10 hosts, the hosts
> are going to be monitored in approx. ((num_hosts mod 10) + 1) *
> monitoring_interval.
>
> If that is not your case, please send us the output of oned.log...
>
>
> Cheers
>
> Ruben
>
> On Thu, Apr 15, 2010 at 10:57 AM, Neil Mooney <neil.mooney at sara.nl> wrote:
>   
>> Hi All,
>>
>> I had some problems with host monitoring, namely that I could not
>> decrease the cycle time to monitor a host.
>> It is my understanding that we might need to monitor our hosts more
>> often to get better / fairer scheduling.
>>
>> I changed HOST_MONITORING_INTERVAL setting to 10 seconds, but I could
>> not realise a cycle time of less than 30 seconds...
>>
>> oneadmin at node15-one:~$ grep INTERVAL /etc/one/oned.conf
>> #  HOST_MONITORING_INTERVAL: Time in seconds between host monitorization
>> #  VM_POLLING_INTERVAL: Time in seconds between virtual machine monitorization
>> HOST_MONITORING_INTERVAL = 10
>> VM_POLLING_INTERVAL      = 60
>>
>> I wrote a script that tries to emulate the ruby monitoring script, but
>> pulls the information directly from system snmp counters to
>> This should give a more real-time status of each host and avoid the
>> output parsing, scp and ssh that the ruby script involves.
>>
>> Is host monitoring via SNMP a good idea? Perhaps it's faster / more
>> scalable?
>>
>> Example output using the ruby script from ONE:
>>
>> oneadmin at node15-one:~$ time /tmp/one-im/one_im-7a0979c4d3d29cded6cdb99498449870
>> HYPERVISOR=kvm
>> TOTALCPU=800
>> CPUSPEED=2261
>> TOTALMEMORY=24735424
>> USEDMEMORY=10151572
>> FREEMEMORY=24159416
>> FREECPU=770.4
>> USEDCPU=29.6
>> NETRX=18837297726
>> NETTX=328653416717
>>
>> real    0m3.531s
>> user    0m0.000s
>> sys     0m0.028s
>>
>> Example output from my snmp script:
>>
>> oneadmin at node15-one:~$ time scripts/onemonitor.sh node16-one
>> HYPERVISOR=kvm
>> TOTALCPU=800
>> CPUSPEED=2270
>> TOTALMEMORY=24735424
>> USEDMEMORY=18889772
>> FREEMEMORY=5845652
>> FREECPU=0.00
>> USEDCPU=800.00
>> NETRX=513353459
>> NETTX=3422880146
>>
>> real    0m0.039s
>> user    0m0.012s
>> sys     0m0.048s
>> oneadmin at node15-one:~$ time scripts/onemonitor.sh localhost
>> HYPERVISOR=kvm
>> TOTALCPU=800
>> CPUSPEED=2270
>> TOTALMEMORY=24735424
>> USEDMEMORY=9011744
>> FREEMEMORY=15723680
>> FREECPU=792.00
>> USEDCPU=8.00
>> NETRX=513356316
>> NETTX=3422883584
>>
>> real    0m0.039s
>> user    0m0.020s
>> sys     0m0.040s
>> oneadmin at node15-one:~$
>>
>> Cheers
>>
>> Neil
>>
>>


