[one-users] What remotes commands does one 4.6 use:
Steven Timm
timm at fnal.gov
Tue Jul 29 07:35:25 PDT 2014
I am still trying to debug a nasty monitoring inconsistency.
-bash-4.1$ onevm list | grep fgtest14
26 oneadmin oneadmin fgt6x4-26 runn 6 4G fgtest14 117d 19h50
27 oneadmin oneadmin fgt5x4-27 runn 10 4G fgtest14 117d 17h57
28 oneadmin oneadmin fgt1x1-28 runn 10 4.1G fgtest14 117d 16h59
30 oneadmin oneadmin fgt5x1-30 runn 0 4G fgtest14 116d 23h50
33 oneadmin oneadmin ip6sl5vda-33 runn 6 4G fgtest14 116d 19h57
-bash-4.1$ onehost list
ID NAME CLUSTER RVM ALLOCATED_CPU ALLOCATED_MEM STAT
3 fgtest11 ipv6 0 0 / 400 (0%) 0K / 15.7G (0%) on
4 fgtest12 ipv6 0 0 / 400 (0%) 0K / 15.7G (0%) on
7 fgtest13 ipv6 0 0 / 800 (0%) 0K / 23.6G (0%) on
8 fgtest14 ipv6 5 0 / 800 (0%) 0K / 23.6G (0%) on
9 fgtest20 ipv6 3 300 / 800 (37%) 12G / 31.4G (38%) on
11 fgtest19 ipv6 0 0 / 800 (0%) 0K / 31.5G (0%) on
-bash-4.1$ onehost show 8
HOST 8 INFORMATION
ID : 8
NAME : fgtest14
CLUSTER : ipv6
STATE : MONITORED
IM_MAD : kvm
VM_MAD : kvm
VN_MAD : dummy
LAST MONITORING TIME : 07/29 09:25:45
HOST SHARES
TOTAL MEM : 23.6G
USED MEM (REAL) : 876.4M
USED MEM (ALLOCATED) : 0K
TOTAL CPU : 800
USED CPU (REAL) : 0
USED CPU (ALLOCATED) : 0
RUNNING VMS : 5
LOCAL SYSTEM DATASTORE #102 CAPACITY
TOTAL: : 548.8G
USED: : 175.3G
FREE: : 345.6G
MONITORING INFORMATION
ARCH="x86_64"
CPUSPEED="2992"
HOSTNAME="fgtest14.fnal.gov"
HYPERVISOR="kvm"
MODELNAME="Intel(R) Xeon(R) CPU E5450 @ 3.00GHz"
NETRX="234844577"
NETTX="21553126"
RESERVED_CPU=""
RESERVED_MEM=""
VERSION="4.6.0"
VIRTUAL MACHINES
ID USER GROUP NAME STAT UCPU UMEM HOST TIME
26 oneadmin oneadmin fgt6x4-26 runn 6 4G fgtest14 117d 19h50
27 oneadmin oneadmin fgt5x4-27 runn 10 4G fgtest14 117d 17h57
28 oneadmin oneadmin fgt1x1-28 runn 10 4.1G fgtest14 117d 17h00
30 oneadmin oneadmin fgt5x1-30 runn 0 4G fgtest14 116d 23h50
33 oneadmin oneadmin ip6sl5vda-33 runn 6 4G fgtest14 116d 19h57
-----------------------------------------------------------------------------------
All of this looks great, right?
Just one problem: there are no VMs running on fgtest14, and there
haven't been for 4 days.
[root@fgtest14 ~]# virsh list
Id Name State
----------------------------------------------------
[root@fgtest14 ~]#
-------------------------------------------------------------------------
Yet the monitoring reports no errors.
Tue Jul 29 09:28:10 2014 [InM][D]: Host fgtest14 (8) successfully monitored.
-----------------------------------------------------------------------------
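The mismatch above (five VMs in "runn" according to onevm list, none according to virsh list) can be checked mechanically. The following is a minimal Python sketch, not part of ONE: it parses captured text from the two commands and reports the VM IDs that ONE lists as running on a host but that libvirt does not know about. The function names are hypothetical, and it assumes that the libvirt domain for VM N is named one-N, which should be verified on the host before relying on the result.

```python
# Sketch only: cross-check captured `onevm list` text against captured
# `virsh list` text. Function names are hypothetical; the "one-<ID>"
# domain naming is an assumption, not something stated in this thread.

ONEVM_SAMPLE = """\
26 oneadmin oneadmin fgt6x4-26 runn 6 4G fgtest14 117d 19h50
27 oneadmin oneadmin fgt5x4-27 runn 10 4G fgtest14 117d 17h57
28 oneadmin oneadmin fgt1x1-28 runn 10 4.1G fgtest14 117d 16h59
30 oneadmin oneadmin fgt5x1-30 runn 0 4G fgtest14 116d 23h50
33 oneadmin oneadmin ip6sl5vda-33 runn 6 4G fgtest14 116d 19h57
"""

VIRSH_EMPTY = """\
 Id Name State
----------------------------------------------------
"""


def one_running_ids(onevm_output, host):
    """Return IDs of VMs that the onevm listing shows as 'runn' on host."""
    ids = []
    for line in onevm_output.splitlines():
        fields = line.split()
        # Columns: ID USER GROUP NAME STAT UCPU UMEM HOST TIME
        if (len(fields) >= 8 and fields[0].isdigit()
                and fields[4] == "runn" and fields[7] == host):
            ids.append(int(fields[0]))
    return ids


def libvirt_domains(virsh_output):
    """Return the set of domain names found in the virsh listing."""
    names = set()
    for line in virsh_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric Id ("-" for inactive domains);
        # the header and the dashed separator are skipped.
        if len(fields) >= 3 and (fields[0].isdigit() or fields[0] == "-"):
            names.add(fields[1])
    return names


def stale_vm_ids(onevm_output, virsh_output, host):
    """IDs that ONE reports as running on host but libvirt does not list."""
    domains = libvirt_domains(virsh_output)
    return [vm_id for vm_id in one_running_ids(onevm_output, host)
            if "one-%d" % vm_id not in domains]


print(stale_vm_ids(ONEVM_SAMPLE, VIRSH_EMPTY, "fgtest14"))
# -> [26, 27, 28, 30, 33]
```

With the listings shown in this message, every one of the five VMs comes back as stale, which matches what was found by hand.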
At the same time, there is no evidence that ONE is actually trying to
monitor these five VMs, let alone succeeding, yet they are still stuck in
"runn", which means I can't do a onevm restart to restart them.
(The images of these 5 VMs are still out there on the VM host, and
I would like to save and restart them if I can.)
What is the remotes command that ONE 4.6 would use to monitor this host?
Can I run it manually and see what output I get?
Are we dealing with some kind of bug, or just a very confused system?
Any help is appreciated. I have to get this sorted out before
I dare deploy ONE 4.x in production.
Steve Timm
------------------------------------------------------------------
Steven C. Timm, Ph.D (630) 840-8525
timm at fnal.gov http://home.fnal.gov/~timm/
Fermilab Scientific Computing Division, Scientific Computing Services Quad.
Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing