[one-users] What remotes commands does one 4.6 use:

Ruben S. Montero rsmontero at opennebula.org
Wed Jul 30 07:45:28 PDT 2014


Hi,

1.- monitor_ds.sh may use LVM commands (vgdisplay) that need sudo access.
This should be set up automatically by the OpenNebula node packages.

2.- The collectd client is not a real daemon. The first time a host is
monitored, a process is left running to send monitoring information
periodically. OpenNebula restarts it if no information is received in
3 monitor steps. Nothing needs to be set up.
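
For point 1, one way to confirm that the sudo setup is in place is to ask
sudo to fail instead of prompting. What follows is only a minimal sketch,
assuming a POSIX shell on the node and that it is run as oneadmin. The
function name check_passwordless_sudo is made up for this example, and the
SUDO variable simply mirrors the one set in scripts_common.sh:

```shell
# Hypothetical helper (not part of OpenNebula): report whether a command
# can run under sudo without a password prompt.
# "sudo -n" (non-interactive) exits with an error instead of prompting.
# SUDO can be overridden; it defaults to "sudo".
check_passwordless_sudo() {
    if ${SUDO:-sudo} -n "$@" >/dev/null 2>&1; then
        echo "OK: '$*' runs under sudo without a password"
    else
        echo "FAIL: '$*' needs a password or is not allowed by sudoers"
        return 1
    fi
}
```

Running "check_passwordless_sudo vgdisplay" on the node should print the
FAIL line right away when the sudoers entry is missing, rather than
stopping at a password prompt.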

Cheers


On Wed, Jul 30, 2014 at 3:50 PM, Steven Timm <timm at fnal.gov> wrote:

> On Wed, 30 Jul 2014, Ruben S. Montero wrote:
>
>
>> Maybe you could try to execute the monitor probes on the node:
>>
>> 1. ssh to the node
>> 2. Go to /var/tmp/one/im
>> 3. Execute run_probes kvm-probes
>>
>
> When I do that (using sh -x), I get the following:
>
> -bash-4.1$ sh -x ./run_probes kvm-probes
> ++ dirname ./run_probes
> + source ./../scripts_common.sh
> ++ export LANG=C
> ++ LANG=C
> ++ export PATH=/bin:/sbin:/usr/bin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
> ++ PATH=/bin:/sbin:/usr/bin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
> ++ AWK=awk
> ++ BASH=bash
> ++ CUT=cut
> ++ DATE=date
> ++ DD=dd
> ++ DF=df
> ++ DU=du
> ++ GREP=grep
> ++ ISCSIADM=iscsiadm
> ++ LVCREATE=lvcreate
> ++ LVREMOVE=lvremove
> ++ LVRENAME=lvrename
> ++ LVS=lvs
> ++ LN=ln
> ++ MD5SUM=md5sum
> ++ MKFS=mkfs
> ++ MKISOFS=genisoimage
> ++ MKSWAP=mkswap
> ++ QEMU_IMG=qemu-img
> ++ RADOS=rados
> ++ RBD=rbd
> ++ READLINK=readlink
> ++ RM=rm
> ++ SCP=scp
> ++ SED=sed
> ++ SSH=ssh
> ++ SUDO=sudo
> ++ SYNC=sync
> ++ TAR=tar
> ++ TGTADM=tgtadm
> ++ TGTADMIN=tgt-admin
> ++ TGTSETUPLUN=tgt-setup-lun-one
> ++ TR=tr
> ++ VGDISPLAY=vgdisplay
> ++ VMKFSTOOLS=vmkfstools
> ++ WGET=wget
> +++ uname -s
> ++ '[' xLinux = xLinux ']'
> ++ SED='sed -r'
> +++ basename ./run_probes
> ++ SCRIPT_NAME=run_probes
> + export LANG=C
> + LANG=C
> + HYPERVISOR_DIR=kvm-probes.d
> + ARGUMENTS=kvm-probes
> ++ dirname ./run_probes
> + SCRIPTS_DIR=.
> + cd .
> ++ '[' -d kvm-probes.d ']'
> ++ run_dir kvm-probes.d
> ++ cd kvm-probes.d
> +++ ls architecture.sh collectd-client-shepherd.sh cpu.sh kvm.rb monitor_ds.sh name.sh poll.sh version.sh
> ++ for i in '`ls *`'
> ++ '[' -x architecture.sh ']'
> ++ ./architecture.sh kvm-probes
> ++ EXIT_CODE=0
> ++ '[' x0 '!=' x0 ']'
> ++ for i in '`ls *`'
> ++ '[' -x collectd-client-shepherd.sh ']'
> ++ ./collectd-client-shepherd.sh kvm-probes
> ++ EXIT_CODE=0
> ++ '[' x0 '!=' x0 ']'
> ++ for i in '`ls *`'
> ++ '[' -x cpu.sh ']'
> ++ ./cpu.sh kvm-probes
> ++ EXIT_CODE=0
> ++ '[' x0 '!=' x0 ']'
> ++ for i in '`ls *`'
> ++ '[' -x kvm.rb ']'
> ++ ./kvm.rb kvm-probes
> ++ EXIT_CODE=0
> ++ '[' x0 '!=' x0 ']'
> ++ for i in '`ls *`'
> ++ '[' -x monitor_ds.sh ']'
> ++ ./monitor_ds.sh kvm-probes
> [sudo] password for oneadmin:
>
> and it stays hung on the password for oneadmin.
>
> What's going on?
>
> Also, you mentioned a collectd. Are you saying that OpenNebula 4.6 now
> needs to run a daemon on every single VM host? Where is it documented
> how to set it up?
>
> Steve
>
>> Make sure you do not have a host using the same hostname fgtest14 and
>> running a collectd process.
>>
>> On Jul 29, 2014 4:35 PM, "Steven Timm" <timm at fnal.gov> wrote:
>>
>>       I am still trying to debug a nasty monitoring inconsistency.
>>
>>       -bash-4.1$ onevm list | grep fgtest14
>>           26 oneadmin oneadmin fgt6x4-26       runn    6      4G fgtest14  117d 19h50
>>           27 oneadmin oneadmin fgt5x4-27       runn   10      4G fgtest14  117d 17h57
>>           28 oneadmin oneadmin fgt1x1-28       runn   10    4.1G fgtest14  117d 16h59
>>           30 oneadmin oneadmin fgt5x1-30       runn    0      4G fgtest14  116d 23h50
>>           33 oneadmin oneadmin ip6sl5vda-33    runn    6      4G fgtest14  116d 19h57
>>       -bash-4.1$ onehost list
>>         ID NAME            CLUSTER   RVM      ALLOCATED_CPU      ALLOCATED_MEM STAT
>>          3 fgtest11        ipv6        0       0 / 400 (0%)    0K / 15.7G (0%) on
>>          4 fgtest12        ipv6        0       0 / 400 (0%)    0K / 15.7G (0%) on
>>          7 fgtest13        ipv6        0       0 / 800 (0%)    0K / 23.6G (0%) on
>>          8 fgtest14        ipv6        5       0 / 800 (0%)    0K / 23.6G (0%) on
>>          9 fgtest20        ipv6        3    300 / 800 (37%)  12G / 31.4G (38%) on
>>         11 fgtest19        ipv6        0       0 / 800 (0%)    0K / 31.5G (0%) on
>>       -bash-4.1$ onehost show 8
>>       HOST 8 INFORMATION
>>       ID                    : 8
>>       NAME                  : fgtest14
>>       CLUSTER               : ipv6
>>       STATE                 : MONITORED
>>       IM_MAD                : kvm
>>       VM_MAD                : kvm
>>       VN_MAD                : dummy
>>       LAST MONITORING TIME  : 07/29 09:25:45
>>
>>       HOST SHARES
>>       TOTAL MEM             : 23.6G
>>       USED MEM (REAL)       : 876.4M
>>       USED MEM (ALLOCATED)  : 0K
>>       TOTAL CPU             : 800
>>       USED CPU (REAL)       : 0
>>       USED CPU (ALLOCATED)  : 0
>>       RUNNING VMS           : 5
>>
>>       LOCAL SYSTEM DATASTORE #102 CAPACITY
>>       TOTAL:                : 548.8G
>>       USED:                 : 175.3G
>>       FREE:                 : 345.6G
>>
>>       MONITORING INFORMATION
>>       ARCH="x86_64"
>>       CPUSPEED="2992"
>>       HOSTNAME="fgtest14.fnal.gov"
>>       HYPERVISOR="kvm"
>>       MODELNAME="Intel(R) Xeon(R) CPU           E5450  @ 3.00GHz"
>>       NETRX="234844577"
>>       NETTX="21553126"
>>       RESERVED_CPU=""
>>       RESERVED_MEM=""
>>       VERSION="4.6.0"
>>
>>       VIRTUAL MACHINES
>>
>>           ID USER     GROUP    NAME            STAT UCPU    UMEM HOST TIME
>>           26 oneadmin oneadmin fgt6x4-26       runn    6      4G fgtest14  117d 19h50
>>           27 oneadmin oneadmin fgt5x4-27       runn   10      4G fgtest14  117d 17h57
>>           28 oneadmin oneadmin fgt1x1-28       runn   10    4.1G fgtest14  117d 17h00
>>           30 oneadmin oneadmin fgt5x1-30       runn    0      4G fgtest14  116d 23h50
>>           33 oneadmin oneadmin ip6sl5vda-33    runn    6      4G fgtest14  116d 19h57
>>       -----------------------------------------------------------------------------------
>>
>>       All of this looks great, right?
>>       Just one problem:  There are no VM's running on fgtest14 and
>>       haven't been for 4 days.
>>
>>       [root at fgtest14 ~]# virsh list
>>        Id    Name                           State
>>       ----------------------------------------------------
>>
>>       [root at fgtest14 ~]#
>>
>>       -------------------------------------------------------------------------
>>       Yet the monitoring reports no errors.
>>
>>       Tue Jul 29 09:28:10 2014 [InM][D]: Host fgtest14 (8) successfully monitored.
>>
>>       -----------------------------------------------------------------------------
>>       At the same time, there is no evidence that ONE is actually trying to
>>       monitor these five VMs, let alone succeeding, yet they are still stuck
>>       in "runn", which means I can't do a onevm restart to restart them.
>>       (The VM images of these five VMs are still out there on the VM host, and
>>       I would like to save and restart them if I can.)
>>
>>       What is the remotes command that ONE 4.6 would use to monitor this host?
>>       Can I do it manually and see what output I get?
>>
>>       Are we dealing with some kind of a bug, or just a very confused system?
>>       Any help is appreciated. I have to get this sorted out before
>>       I dare deploy ONE 4.x in production.
>>
>>       Steve Timm
>>
>>
>>       ------------------------------------------------------------------
>>       Steven C. Timm, Ph.D  (630) 840-8525
>>       timm at fnal.gov  http://home.fnal.gov/~timm/
>>       Fermilab Scientific Computing Division, Scientific Computing Services Quad.
>>       Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing
> ------------------------------------------------------------------
> Steven C. Timm, Ph.D  (630) 840-8525
> timm at fnal.gov  http://home.fnal.gov/~timm/
> Fermilab Scientific Computing Division, Scientific Computing Services Quad.
> Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing




-- 
Ruben S. Montero, PhD
Project co-Lead and Chief Architect
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula