[one-users] What remotes commands does one 4.6 use:

Steven Timm timm at fnal.gov
Tue Jul 29 07:35:25 PDT 2014


I am still trying to debug a nasty monitoring inconsistency.

-bash-4.1$ onevm list | grep fgtest14
     26 oneadmin oneadmin fgt6x4-26       runn    6      4G fgtest14   117d 19h50
     27 oneadmin oneadmin fgt5x4-27       runn   10      4G fgtest14   117d 17h57
     28 oneadmin oneadmin fgt1x1-28       runn   10    4.1G fgtest14   117d 16h59
     30 oneadmin oneadmin fgt5x1-30       runn    0      4G fgtest14   116d 23h50
     33 oneadmin oneadmin ip6sl5vda-33    runn    6      4G fgtest14   116d 19h57
-bash-4.1$ onehost list
   ID NAME            CLUSTER   RVM      ALLOCATED_CPU      ALLOCATED_MEM STAT
    3 fgtest11        ipv6        0       0 / 400 (0%)    0K / 15.7G (0%) on
    4 fgtest12        ipv6        0       0 / 400 (0%)    0K / 15.7G (0%) on
    7 fgtest13        ipv6        0       0 / 800 (0%)    0K / 23.6G (0%) on
    8 fgtest14        ipv6        5       0 / 800 (0%)    0K / 23.6G (0%) on
    9 fgtest20        ipv6        3    300 / 800 (37%)  12G / 31.4G (38%) on
   11 fgtest19        ipv6        0       0 / 800 (0%)    0K / 31.5G (0%) on
-bash-4.1$ onehost show 8
HOST 8 INFORMATION
ID                    : 8
NAME                  : fgtest14
CLUSTER               : ipv6
STATE                 : MONITORED
IM_MAD                : kvm
VM_MAD                : kvm
VN_MAD                : dummy
LAST MONITORING TIME  : 07/29 09:25:45

HOST SHARES
TOTAL MEM             : 23.6G
USED MEM (REAL)       : 876.4M
USED MEM (ALLOCATED)  : 0K
TOTAL CPU             : 800
USED CPU (REAL)       : 0
USED CPU (ALLOCATED)  : 0
RUNNING VMS           : 5

LOCAL SYSTEM DATASTORE #102 CAPACITY
TOTAL:                : 548.8G
USED:                 : 175.3G
FREE:                 : 345.6G

MONITORING INFORMATION
ARCH="x86_64"
CPUSPEED="2992"
HOSTNAME="fgtest14.fnal.gov"
HYPERVISOR="kvm"
MODELNAME="Intel(R) Xeon(R) CPU           E5450  @ 3.00GHz"
NETRX="234844577"
NETTX="21553126"
RESERVED_CPU=""
RESERVED_MEM=""
VERSION="4.6.0"

VIRTUAL MACHINES

     ID USER     GROUP    NAME            STAT UCPU    UMEM HOST       TIME
     26 oneadmin oneadmin fgt6x4-26       runn    6      4G fgtest14   117d 19h50
     27 oneadmin oneadmin fgt5x4-27       runn   10      4G fgtest14   117d 17h57
     28 oneadmin oneadmin fgt1x1-28       runn   10    4.1G fgtest14   117d 17h00
     30 oneadmin oneadmin fgt5x1-30       runn    0      4G fgtest14   116d 23h50
     33 oneadmin oneadmin ip6sl5vda-33    runn    6      4G fgtest14   116d 19h57
-----------------------------------------------------------------------------------

All of this looks great, right?
Just one problem: there are no VMs running on fgtest14, and there
haven't been for 4 days.

[root at fgtest14 ~]# virsh list
  Id    Name                           State
----------------------------------------------------

[root at fgtest14 ~]#

-------------------------------------------------------------------------
Yet the monitoring reports no errors.

Tue Jul 29 09:28:10 2014 [InM][D]: Host fgtest14 (8) successfully monitored.

-----------------------------------------------------------------------------
At the same time, there is no evidence that ONE is actually trying to
monitor these five VMs, let alone succeeding. Yet they are still stuck
in "runn", which means I can't do a onevm restart to restart them.
(The images of these five VMs are still out there on the VM host, and
I would like to save and restart them if I can.)
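
The mismatch described above can be checked mechanically. What follows is
only a rough sketch, not anything ONE ships: it takes the text printed by
`onevm list` and by `virsh list` on the host, and reports which VMs ONE
believes are in "runn" on that host but libvirt does not show. It assumes
the libvirt domain for ONE VM <id> is named one-<id> and that the column
order matches the listings in this message; the function names are
made up for illustration.

```python
def one_running_ids(onevm_list_output, host):
    """Return the IDs that `onevm list` shows in state 'runn' on `host`."""
    ids = set()
    for line in onevm_list_output.splitlines():
        fields = line.split()
        # Assumed column order: ID USER GROUP NAME STAT UCPU UMEM HOST TIME
        if (len(fields) >= 8 and fields[0].isdigit()
                and fields[4] == "runn" and fields[7] == host):
            ids.add(int(fields[0]))
    return ids


def libvirt_domain_names(virsh_list_output):
    """Return the domain names from `virsh list` output (rows with a numeric Id)."""
    names = set()
    for line in virsh_list_output.splitlines():
        fields = line.split()
        # Data rows look like: <Id> <Name> <State>; header, separator and
        # shell prompt lines do not start with a number and are skipped.
        if len(fields) >= 3 and fields[0].isdigit():
            names.add(fields[1])
    return names


def missing_on_host(onevm_list_output, virsh_list_output, host):
    """Domains ONE expects on `host` that libvirt does not list."""
    # Assumption: ONE names the libvirt domain for VM <id> "one-<id>".
    expected = {"one-%d" % vm_id
                for vm_id in one_running_ids(onevm_list_output, host)}
    return sorted(expected - libvirt_domain_names(virsh_list_output))
```

Fed the two listings in this message, it should report all five VMs
as missing on fgtest14.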

What is the remotes command that ONE 4.6 would use to monitor this host?
Can I do it manually and see what output I get?

Are we dealing with some kind of a bug, or just a very confused system?
Any help is appreciated. I have to get this sorted out before
I dare deploy ONE 4.x in production.

Steve Timm


------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm at fnal.gov  http://home.fnal.gov/~timm/
Fermilab Scientific Computing Division, Scientific Computing Services Quad.
Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing

