[one-users] ONE 4.8 + Gluster-3.4.5 on Centos 6.5 -- VMs stuck in BOOT

Wed Sep 10 03:09:17 PDT 2014

Hi, thanks for the answer.

The hypervisor hosts can telnet to both clu100 and clu001 on port 24007.
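
For anyone who wants to repeat the telnet check above from each hypervisor host, here is a minimal sketch in Python. The function name `port_is_open` and the 3-second timeout are my own choices, not anything from OpenNebula or Gluster. A successful TCP connect only shows that the glusterd management port is reachable; it says nothing about the brick ports or about permissions.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Running `port_is_open("clu100", 24007)` and `port_is_open("clu001", 24007)` on a hypervisor should both return True if the result matches what telnet showed.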

I'm sorry, I'm an OpenNebula newbie, so it's not clear to me where to
put the DISK_TYPE="GLUSTER" option.
I tried putting the option first in the image attributes, then in
the template custom tags, but nothing changed.
I don't see any Gluster-related messages in the logs.

I'm scratching my head.
Any help is much appreciated. Thanks!

Wed Sep 10 11:49:22 2014 [Z0][DiM][I]: New VM state is ACTIVE.
Wed Sep 10 11:49:22 2014 [Z0][LCM][I]: New VM state is PROLOG.
Wed Sep 10 11:49:46 2014 [Z0][LCM][I]: New VM state is BOOT
Wed Sep 10 11:49:46 2014 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/53/deployment.0
Wed Sep 10 11:49:47 2014 [Z0][VMM][I]: ExitCode: 0
Wed Sep 10 11:49:47 2014 [Z0][VMM][I]: Successfully execute network driver operation: pre.
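
To make it easier to see where the VM stops, the state changes in a log excerpt like the one above can be pulled out with a short script. This is only a sketch: the regular expression is fitted to the lines shown here, the name `state_transitions` is made up, and the timestamp parsing assumes an English locale.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Wed Sep 10 11:49:22 2014 [Z0][LCM][I]: New VM state is PROLOG.
STATE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}) "
    r"\[\w+\]\[(?P<module>\w+)\]\[\w\]: New VM state is (?P<state>\w+)"
)

# State-change lines copied from the log excerpt above.
LOG = """\
Wed Sep 10 11:49:22 2014 [Z0][DiM][I]: New VM state is ACTIVE.
Wed Sep 10 11:49:22 2014 [Z0][LCM][I]: New VM state is PROLOG.
Wed Sep 10 11:49:46 2014 [Z0][LCM][I]: New VM state is BOOT
"""


def state_transitions(log_text):
    """Return a list of (timestamp, module, state) tuples found in log_text."""
    result = []
    for line in log_text.splitlines():
        m = STATE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")
            result.append((ts, m.group("module"), m.group("state")))
    return result


steps = state_transitions(LOG)
for ts, module, state in steps:
    print(ts.time(), module, state)
```

With the excerpt above, the last entry is BOOT and nothing follows it, which matches what `onevm` reports.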

The deployment file:

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
        <name>one-53</name>
        <cputune>
                <shares>1024</shares>
        </cputune>
        <memory>786432</memory>
        <os>
                <type arch='x86_64'>hvm</type>
                <boot dev='hd'/>
        </os>
        <devices>
                <emulator>/usr/libexec/qemu-kvm</emulator>
                <disk type='network' device='disk'>
                        <source protocol='gluster' name='sys-one/53/disk.0'>
                                <host name='clu100' port='24007' transport='tcp'/>
                        </source>
                        <target dev='hda'/>
                        <driver name='qemu' type='qcow2' cache='none'/>
                </disk>
                <disk type='file' device='cdrom'>
                        <source file='/var/lib/one//datastores/110/53/disk.1'/>
                        <target dev='hdb'/>
                        <readonly/>
                        <driver name='qemu' type='raw'/>
                </disk>
                <interface type='bridge'>
                        <source bridge='br0'/>
                        <mac address='02:00:c0:a8:16:7e'/>
                </interface>
                <graphics type='vnc' listen='0.0.0.0' port='5953'/>
        </devices>
        <features>
                <acpi/>
        </features>
</domain>
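
As a cross-check that the generated deployment file really uses the native Gluster path and not the FUSE mount, the disks can be listed with a few lines of Python. This is a sketch under assumptions: `gluster_disks` and `SAMPLE_XML` are hypothetical names, the XML is a trimmed copy of the file above, and the URI is built in the `gluster+transport://server:port/volume/image` form that QEMU accepts.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the deployment file above (disks only).
SAMPLE_XML = """
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>one-53</name>
  <devices>
    <disk type='network' device='disk'>
      <source protocol='gluster' name='sys-one/53/disk.0'>
        <host name='clu100' port='24007' transport='tcp'/>
      </source>
      <target dev='hda'/>
    </disk>
    <disk type='file' device='cdrom'>
      <source file='/var/lib/one//datastores/110/53/disk.1'/>
      <target dev='hdb'/>
    </disk>
  </devices>
</domain>
"""


def gluster_disks(domain_xml):
    """Return [(target_dev, qemu_style_uri)] for Gluster-backed disks."""
    root = ET.fromstring(domain_xml)
    found = []
    for disk in root.findall("./devices/disk[@type='network']"):
        source = disk.find("source")
        if source is None or source.get("protocol") != "gluster":
            continue
        host = source.find("host")
        target = disk.find("target")
        if host is None or target is None:
            continue
        # The 'name' attribute already holds "<volume>/<path to image>".
        uri = "gluster+{}://{}:{}/{}".format(
            host.get("transport", "tcp"),
            host.get("name"),
            host.get("port", "24007"),
            source.get("name"),
        )
        found.append((target.get("dev"), uri))
    return found


for dev, uri in gluster_disks(SAMPLE_XML):
    print(dev, uri)
```

The printed URI could then be handed to `qemu-img info` on the hypervisor as the oneadmin user, assuming the installed qemu-img was built with Gluster support, to see whether the image is readable outside of libvirt.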

oneimage show:

IMAGE 19 INFORMATION
ID             : 19
NAME           : CentOS-6.5-one-4.8_GLUSTER
USER           : oneadmin
GROUP          : oneadmin
DATASTORE      : GLUSTER
TYPE           : OS
REGISTER TIME  : 08/30 10:41:12
PERSISTENT     : No
SOURCE         : /var/lib/one//datastores/108/c6780b5c1667ec829b9ed92f7853f934
PATH           : http://marketplace.c12g.com/appliance/53e767ba8fb81d6a69000001/download/0
SIZE           : 10G
STATE          : used
RUNNING_VMS    : 1

PERMISSIONS
OWNER          : um-
GROUP          : ---
OTHER          : ---

IMAGE TEMPLATE
DEV_PREFIX="hd"
DISK_TYPE="GLUSTER"
FROM_APP="53e767ba8fb81d6a69000001"
FROM_APP_FILE="0"
FROM_APP_NAME="CentOS 6.5 - KVM - OpenNebula 4.8"
MD5="9d937b8fe70c403330c9284538f07cfc"

VIRTUAL MACHINES

    ID USER     GROUP    NAME            STAT UCPU    UMEM HOST             TIME
    53 oneadmin oneadmin CentOS 6.5 - KV boot    0    768M pre-openne   0d 00h12

onetemplate show:

TEMPLATE 8 INFORMATION
ID             : 8
NAME           : CentOS 6.5 - KVM - OpenNebula 4.8_GLSUTER
USER           : oneadmin
GROUP          : oneadmin
REGISTER TIME  : 08/30 10:41:13

PERMISSIONS
OWNER          : um-
GROUP          : ---
OTHER          : ---

TEMPLATE CONTENTS
CONTEXT=[
  NETWORK="YES",
  SSH_PUBLIC_KEY="$USER[SSH_PUBLIC_KEY]" ]
CPU="1"
DISK=[
  DRIVER="qcow2",
  IMAGE="CentOS-6.5-one-4.8_GLUSTER",
  IMAGE_UNAME="oneadmin" ]
DISK_TYPE="GLUSTER"
FROM_APP="53e767ba8fb81d6a69000001"
FROM_APP_NAME="CentOS 6.5 - KVM - OpenNebula 4.8"
GRAPHICS=[
  LISTEN="0.0.0.0",
  TYPE="vnc" ]
MEMORY="768"
NIC=[
  NETWORK="private",
  NETWORK_UNAME="oneadmin" ]

2014-09-09 12:38 GMT+02:00 Javier Fontan <jfontan at opennebula.org>:
> That's right. Even if it is using GlusterFS the way of accessing the
> files is using the fuse filesystem. This makes the IO performance
> suffer.
>
> Do you get any errors in the log files related to gluster when you try
> to boot a machine with DISK_TYPE="GLUSTER" activated? It could be a
> firewall or a permission problem. Make sure that the hypervisor host
> can access clu100 port 24007.
>
> Also make sure that the server has the rpc-auth-allow-insecure option
> configured and was restarted after the change.
>
> On Sat, Aug 30, 2014 at 11:44 AM, Marco Aroldi <marco.aroldi at gmail.com> wrote:
>> Hi all,
>> this is my first post to the list
>>
>> My goal is to get ONE 4.8 up and running using Gluster as datastore,
>> everything on CentOS 6.5
>> The problem: the VM remains stuck in BOOT status
>> I've found a way to boot the machines (see below), but I think it is not
>> the correct way to manage this setup.
>>
>> First, let me describe what I've done until now:
>> I've followed the docs at
>> http://docs.opennebula.org/4.8/administration/storage/gluster_ds.html
>> and the post on the blog by Javier Fontan
>> http://opennebula.org/native-glusterfs-image-access-for-kvm-drivers/
>>
>> This is my Gluster volume:
>> Volume Name: sys-one
>> Type: Replicate
>> Volume ID: f1bf1bcc-0280-46db-aab8-69fd34672263
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: clu001:/one
>> Brick2: clu100:/one
>> Options Reconfigured:
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> network.remote-dio: enable
>> cluster.eager-lock: enable
>> performance.stat-prefetch: on
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> storage.owner-gid: 9869
>> storage.owner-uid: 9869
>> server.allow-insecure: on
>>
>> And the datastores:
>>   ID NAME                SIZE AVAIL CLUSTER      IMAGES TYPE DS       TM
>>    1 default           230.7G 86%   -                 6 img  fs       shared
>>    2 files             230.7G 86%   -                 0 fil  fs       ssh
>>  108 GLUSTER              24G 52%   clussssss         2 img  fs       shared
>>  110 new system           24G 52%   clussssss         0 sys  -        shared
>>
>>
>> DATASTORE 108 INFORMATION
>> ID             : 108
>> NAME           : GLUSTER
>> USER           : oneadmin
>> GROUP          : oneadmin
>> CLUSTER        : clussssss
>> TYPE           : IMAGE
>> DS_MAD         : fs
>> TM_MAD         : shared
>> BASE PATH      : /var/lib/one//datastores/108
>> DISK_TYPE      :
>>
>> DATASTORE CAPACITY
>> TOTAL:         : 24G
>> FREE:          : 12.5G
>> USED:          : 6.3G
>> LIMIT:         : 12.7G
>>
>> PERMISSIONS
>> OWNER          : um-
>> GROUP          : u--
>> OTHER          : ---
>>
>> DATASTORE TEMPLATE
>> BASE_PATH="/var/lib/one//datastores/"
>> CLONE_TARGET="SYSTEM"
>> DISK_TYPE="GLUSTER"
>> DS_MAD="fs"
>> GLUSTER_HOST="clu100:24007"
>> GLUSTER_VOLUME="sys-one"
>> LIMIT_MB="13000"
>> LN_TARGET="NONE"
>> TM_MAD="shared"
>> TYPE="IMAGE_DS"
>>
>> DATASTORE 110 INFORMATION
>> ID             : 110
>> NAME           : new system
>> USER           : oneadmin
>> GROUP          : oneadmin
>> CLUSTER        : clussssss
>> TYPE           : SYSTEM
>> DS_MAD         : -
>> TM_MAD         : shared
>> BASE PATH      : /var/lib/one//datastores/110
>> DISK_TYPE      : FILE
>>
>> DATASTORE CAPACITY
>> TOTAL:         : 24G
>> FREE:          : 12.5G
>> USED:          : 6.3G
>> LIMIT:         : -
>>
>> PERMISSIONS
>> OWNER          : um-
>> GROUP          : u--
>> OTHER          : ---
>>
>> DATASTORE TEMPLATE
>> BASE_PATH="/var/lib/one//datastores/"
>> SHARED="YES"
>> TM_MAD="shared"
>> TYPE="SYSTEM_DS"
>>
>>
>> Here is the mounted glusterfs:
>> clu100:/sys-one on /gluster type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
>>
>> And the symbolic links in the datastores directory:
>> lrwxrwxrwx  1 oneadmin oneadmin    8 Aug 30 10:18 108 -> /gluster
>> lrwxrwxrwx  1 oneadmin oneadmin    8 Aug 30 10:18 110 -> /gluster
>>
>> I've found the culprit in the system datastore:
>> Created a new system datastore ON THE LOCAL FILESYSTEM:
>>
>> 111 system            230.7G 86%   -                 0 sys  -        shared
>>
>> DATASTORE 111 INFORMATION
>> ID             : 111
>> NAME           : system
>> USER           : oneadmin
>> GROUP          : oneadmin
>> CLUSTER        : -
>> TYPE           : SYSTEM
>> DS_MAD         : -
>> TM_MAD         : shared
>> BASE PATH      : /var/lib/one//datastores/111
>> DISK_TYPE      : FILE
>>
>> DATASTORE CAPACITY
>> TOTAL:         : 230.7G
>> FREE:          : 199.2G
>> USED:          : 1M
>> LIMIT:         : -
>>
>> PERMISSIONS
>> OWNER          : um-
>> GROUP          : u--
>> OTHER          : ---
>>
>> DATASTORE TEMPLATE
>> BASE_PATH="/var/lib/one//datastores/"
>> SHARED="YES"
>> TM_MAD="shared"
>> TYPE="SYSTEM_DS"
>>
>> Deploying now puts the VM in RUNNING status but, correct me if I'm
>> wrong, this setup is not compliant, right?
>> Thanks for the help
>>
>> Marco
>
>
>
> --
> Javier Fontán Muiños
> Developer
> OpenNebula - Flexible Enterprise Cloud Made Simple
> www.OpenNebula.org | @OpenNebula | github.com/jfontan