[one-users] iSCSI multipath

Mihály Héder mihaly.heder at sztaki.mta.hu
Thu Jan 24 14:32:29 PST 2013


Hi,

Well, if you can run lvs or lvscan successfully on at least one server,
then the metadata is probably fine.
We had similar issues before we learned how to exclude unnecessary
block devices in the LVM config.

The thing is that, by default, lvscan and lvs will check _every_ potential
block device for LVM partitions. If you are lucky, this is only annoying,
because it throws 'can't read /dev/sdX' or similar messages. However, if
you are using dm-multipath, you will have one device for each path, like
/dev/sdr, _plus_ the aggregated device with the name you have configured
in multipath.conf (/dev/mapper/yourname), which is the one you actually
need. LVM did not quite understand this situation and got stuck on the
individual path devices, so we configured it to look for LVM only in the
right place. In the lvm.conf man page, look for the scan and filter
options in the devices section. There are also quite good examples in the
comments there.
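
As a rough illustration of the filter option, here is a minimal sketch of
an lvm.conf fragment. It assumes that the multipath alias is the
placeholder 'yourname' used above, that the PV sits on the whole multipath
device, and that no other device on the host carries LVM. If your root or
other local volumes are on LVM, their devices need an accept pattern too,
otherwise they will be hidden as well.

```
devices {
    # Patterns are tried in order; the first one that matches decides.
    # "a" accepts a device, "r" rejects it.
    # Accept only the aggregated multipath device (placeholder name),
    # reject everything else, including the per-path /dev/sdX devices.
    filter = [ "a|^/dev/mapper/yourname$|", "r|.*|" ]
}
```

After changing the filter, running lvs or lvscan again should show whether
the individual path devices are still being probed.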

Also, there could be a much simpler explanation for the issue:
something wrong with the iSCSI connection or multipath, which are one
layer below.

I hope this helps.

Cheers
Mihály

On 24 January 2013 23:18, Miloš Kozák <milos.kozak at lejmr.com> wrote:
> Hi, thank you. I tried updating the TM ln script, which works, but it is not
> a clean solution. So I will try to write the hook code and then we can
> discuss it.
>
> I deployed a few VMs and now the lvs command freezes on the other server. I
> have not set up clvm; do you think this could be caused by LVM metadata
> corruption? The thing is, I can no longer start a VM on the other server.
>
> Miloš
>
> Dne 24.1.2013 23:10, Mihály Héder napsal(a):
>
>> Hi!
>>
>> We solve this problem with hooks that activate the LVs for us
>> when we start/migrate a VM. Unfortunately I will be out of the office
>> until early next week, but then I will consult my colleague who
>> did the actual coding of this part and we will share the code.
>>
>> Cheers
>> Mihály
>>
>> On 24 January 2013 20:15, Miloš Kozák<milos.kozak at lejmr.com>  wrote:
>>>
>>> Hi, I have just set it up with two hosts sharing a block device, with LVM
>>> on top of that, as discussed earlier. When I run lvs I can see all the
>>> logical volumes. When I create a new LV on the other server, the LV shows
>>> up as inactive, so I have to run lvchange -ay VG/LV to activate it; only
>>> then can this LV be used for a VM.
>>>
>>> Is there any trick to automatically activate a newly created LV on every
>>> host?
>>>
>>> Thanks Milos
>>>
>>> Dne 22.1.2013 18:22, Mihály Héder napsal(a):
>>>
>>>> Hi!
>>>>
>>>> You need to look at locking_type in the lvm.conf manual [1]. The
>>>> default, locking in a local directory, is OK for the frontend, and
>>>> type 4 is read-only. However, do not forget that this only prevents
>>>> damage done through the LVM commands. If you start writing zeros to
>>>> your disk with the dd command, for example, that will kill your
>>>> partition regardless of the LVM setting. So this mainly protects
>>>> against user or middleware errors, not against malicious attacks.
>>>>
>>>> Cheers
>>>> Mihály Héder
>>>> MTA SZTAKI
>>>>
>>>> [1] http://linux.die.net/man/5/lvm.conf
>>>>
>>>> On 21 January 2013 18:58, Miloš Kozák<milos.kozak at lejmr.com>   wrote:
>>>>>
>>>>> Oh snap, that sounds great, I didn't know about that. It makes everything
>>>>> easier. In this scenario only the frontend can work with LVM, so there
>>>>> are no issues with concurrent changes. Only one last thing to make it
>>>>> really safe against that: is there any way to suppress LVM changes from
>>>>> the hosts, i.e. make it read-only there, and leave it read-write on the
>>>>> frontend?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>> Dne 21.1.2013 18:50, Mihály Héder napsal(a):
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> no, you don't have to do any of that. Also, nebula doesn't have to
>>>>>> care about LVM metadata at all, and therefore there is no corresponding
>>>>>> function in it. In /etc/lvm there is no metadata, only configuration
>>>>>> files.
>>>>>>
>>>>>> LVM metadata simply sits somewhere at the beginning of your
>>>>>> iSCSI-shared disk, like a partition table. So it is on the storage
>>>>>> that is accessed by all your hosts, and no distribution is necessary.
>>>>>> The nebula frontend simply issues lvcreate, lvchange, etc. on this
>>>>>> shared disk, and those commands manipulate the metadata.
>>>>>>
>>>>>> It is really LVM's internal business, many layers below opennebula.
>>>>>> All you have to make sure of is that you don't run these commands
>>>>>> concurrently from multiple hosts on the same iSCSI-attached disk,
>>>>>> because then they could interfere with each other. This setting is
>>>>>> what you have to indicate in /etc/lvm on the server hosts.
>>>>>>
>>>>>> Cheers
>>>>>> Mihály
>>>>>>
>>>>>> On 21 January 2013 18:37, Miloš Kozák<milos.kozak at lejmr.com>   wrote:
>>>>>>>
>>>>>>> Thank you. Does this mean that I can distribute the metadata files
>>>>>>> located in /etc/lvm on the frontend to the other hosts, and these hosts
>>>>>>> will then see my logical volumes? Is there any code in nebula that does
>>>>>>> this? Or do I need to update the DS scripts to update/distribute the
>>>>>>> LVM metadata among the servers?
>>>>>>>
>>>>>>> Thanks, Milos
>>>>>>>
>>>>>>> Dne 21.1.2013 18:29, Mihály Héder napsal(a):
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> LVM metadata [1] is simply stored on the disk. In the setup we are
>>>>>>>> discussing, this happens to be a shared virtual disk on the storage,
>>>>>>>> so any other hosts that attach the same virtual disk should see the
>>>>>>>> changes as they happen, provided that they re-read the disk. This
>>>>>>>> re-reading step is what you can trigger with lvscan, but nowadays
>>>>>>>> that seems to be unnecessary. For us it works with CentOS 6.3, so I
>>>>>>>> guess Scientific Linux should be fine as well.
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> Mihály
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] http://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/lvm_metadata.html
>>>>>>>>
>>>>>>>> On 21 January 2013 12:53, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> thank you for the great answer. As I wrote, my objective is to avoid
>>>>>>>>> as much clustering software (pacemaker, ...) as possible, so clvm is
>>>>>>>>> one of the things I feel uneasy about in my configuration. Therefore
>>>>>>>>> I would rather let nebula manage the LVM metadata in the first place,
>>>>>>>>> as you wrote. The one last thing I don't understand is how nebula
>>>>>>>>> distributes the LVM metadata.
>>>>>>>>>
>>>>>>>>> Is the kernel in Scientific Linux 6.3 new enough to avoid the LVM
>>>>>>>>> issue you mentioned?
>>>>>>>>>
>>>>>>>>> Thanks Milos
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dne 21.1.2013 12:34, Mihály Héder napsal(a):
>>>>>>>>>
>>>>>>>>>> Hi!
>>>>>>>>>>
>>>>>>>>>> The last time we could test an EqualLogic, it did not have an option
>>>>>>>>>> to create/configure virtual disks inside it through an API, so I
>>>>>>>>>> think the iSCSI driver is not an alternative, as it would require a
>>>>>>>>>> configuration step per virtual machine on the storage.
>>>>>>>>>>
>>>>>>>>>> However, you can use your storage just fine in a shared LVM scenario.
>>>>>>>>>> You need to consider two different things: the LVM metadata, and the
>>>>>>>>>> actual VM data on the partitions.
>>>>>>>>>> - It is true that concurrent modification of the metadata should be
>>>>>>>>>> avoided, as in theory it can damage the whole volume group. You could
>>>>>>>>>> use clvm, which avoids that through clustered locking, and then every
>>>>>>>>>> participating machine can safely create/modify/delete LVs. However,
>>>>>>>>>> in a nebula setup this is not necessary in every case: you can make
>>>>>>>>>> the LVM metadata read-only on your host servers and let only the
>>>>>>>>>> frontend modify it. The frontend can then use local locking, which
>>>>>>>>>> does not require clvm.
>>>>>>>>>> - Of course, the host servers can write the data inside the
>>>>>>>>>> partitions even though the metadata is read-only for them. It should
>>>>>>>>>> work just fine as long as you don't start two VMs on one partition.
>>>>>>>>>>
>>>>>>>>>> We are running this setup with a dual-controller Dell MD3600 storage
>>>>>>>>>> without issues so far. Before that, we did the same with Xen machines
>>>>>>>>>> for years on an older EMC (that was before nebula). Now with nebula we
>>>>>>>>>> use a home-grown module for this, which I can send you any time; we
>>>>>>>>>> plan to submit it as a feature enhancement anyway. Also, there seems
>>>>>>>>>> to be a similar shared LVM module in the nebula upstream, which we
>>>>>>>>>> could not get to work yet, but we did not try very hard.
>>>>>>>>>>
>>>>>>>>>> The plus side of this setup is that you can make live migration work
>>>>>>>>>> nicely. There are two points to consider, however. First, once you set
>>>>>>>>>> the LVM metadata read-only, you won't be able to modify the local LVM
>>>>>>>>>> volumes on your servers, if there are any. Second, in older kernels,
>>>>>>>>>> when you modified the LVM on one machine, the others did not get
>>>>>>>>>> notified about the changes, so you had to issue an lvs command. In
>>>>>>>>>> newer kernels this issue seems to be solved and the LVs get updated
>>>>>>>>>> instantly. I don't know when and what exactly changed, though.
>>>>>>>>>>
>>>>>>>>>> Cheers
>>>>>>>>>> Mihály Héder
>>>>>>>>>> MTA SZTAKI ITAK
>>>>>>>>>>
>>>>>>>>>> On 18 January 2013 08:57, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi, I am setting up a small installation of opennebula with shared
>>>>>>>>>>> storage using iSCSI. The storage is an EqualLogic EMC with two
>>>>>>>>>>> controllers. At the moment we have only two host servers, so we use
>>>>>>>>>>> a backed direct connection between the storage and each server, see
>>>>>>>>>>> attachment. For this purpose we set up dm-multipath, because in the
>>>>>>>>>>> future we want to add other servers and some other technology will
>>>>>>>>>>> be necessary in the network segment. For now we try to keep it as
>>>>>>>>>>> close as possible to the future topology from the protocol point of
>>>>>>>>>>> view.
>>>>>>>>>>>
>>>>>>>>>>> My question is about how to define the datastore: which driver and TM
>>>>>>>>>>> are the best choice?
>>>>>>>>>>>
>>>>>>>>>>> My primary objective is to avoid GFS2 or any other cluster
>>>>>>>>>>> filesystem; I would prefer to keep the datastore as block devices.
>>>>>>>>>>> The only option I see is to use LVM, but I worry about concurrent
>>>>>>>>>>> writes. Isn't that a problem? I was googling a bit and found that I
>>>>>>>>>>> would need to set up clvm. Is that really necessary?
>>>>>>>>>>>
>>>>>>>>>>> Or is it better to use the iSCSI driver, drop dm-multipath and hope?
>>>>>>>>>>>
>>>>>>>>>>> Thanks, Milos
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Users mailing list
>>>>>>>>>>> Users at lists.opennebula.org
>>>>>>>>>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>>>>>>>>>>