[one-users] iSCSI multipath

Marlok Tamás tmarlok at sztaki.hu
Wed Jan 30 04:59:02 PST 2013


Hi,

We are running it without CLVM.
If you examine the ONE/lvm driver (the tm/clone script for example), you
can see that the lvcreate command runs on the destination host. In the
shared LVM driver, all the LVM commands run on the frontend, hence
there is no possibility of parallel changes (assuming that you are using
only 1 frontend), because local locking is in effect on the frontend.
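
To make the split concrete, here is a rough sketch of which command runs
where. It is not the code of the shared_lvm driver; the helper names, the
volume group "vg-one", the host "host01" and the use of plain ssh are
assumptions made only for illustration.

```python
import shlex
from typing import List


def frontend_create(vg: str, lv: str, size: str) -> List[str]:
    """Metadata-changing command: meant to run only on the frontend."""
    return ["lvcreate", "-L", size, "-n", lv, vg]


def host_activate(host: str, vg: str, lv: str) -> List[str]:
    """Activation of an existing LV on the destination host, wrapped in ssh.

    Using ssh here is an assumption; the real driver may reach the host
    differently.
    """
    remote = shlex.join(["lvchange", "-ay", f"{vg}/{lv}"])
    return ["ssh", host, remote]


if __name__ == "__main__":
    # Hypothetical names; only lv-one-50 appears in this thread.
    for cmd in (frontend_create("vg-one", "lv-one-50", "10G"),
                host_activate("host01", "vg-one", "lv-one-50")):
        print(shlex.join(cmd))
```

The sketch only builds the command lines. The point is that lvcreate is
issued in a single place, while the hosts merely activate what already
exists.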

The other thing with the ONE/lvm driver is that it makes a snapshot in the
clone script, while our driver makes a new clone LV. I tried to use the
original LVM driver, and every time I deployed a new VM I got this error
message:

lv-one-50 must be active exclusively to create snapshot

If you (or anyone else) know how to avoid this error, please let me
know.
Besides that, snapshots are much slower in write operations (as far as I
know).
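
The difference between the two clone strategies can be sketched roughly as
follows. This is an illustration, not the code of either driver: the
function names, the volume group name and the dd block size are
assumptions, and the real scripts may copy the data in another way.

```python
from typing import List


def snapshot_commands(vg: str, origin: str, name: str, size: str) -> List[List[str]]:
    """Snapshot-style clone: one LV that stays tied to its origin."""
    return [["lvcreate", "-s", "-L", size, "-n", name, f"{vg}/{origin}"]]


def full_clone_commands(vg: str, origin: str, name: str, size: str) -> List[List[str]]:
    """Independent clone: create a new LV, then copy the origin into it."""
    return [
        ["lvcreate", "-L", size, "-n", name, vg],
        # bs=1M is an arbitrary choice for this sketch.
        ["dd", f"if=/dev/{vg}/{origin}", f"of=/dev/{vg}/{name}", "bs=1M"],
    ]


if __name__ == "__main__":
    # Hypothetical names, used only to print the resulting command lines.
    for cmd in full_clone_commands("vg-one", "lv-one-50", "lv-one-51", "10G"):
        print(" ".join(cmd))
```

The full clone costs a complete copy at deploy time, but afterwards the new
LV does not depend on the origin.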

Hope this helps!
--
Cheers,
tmarlok


On Wed, Jan 30, 2013 at 1:37 PM, Miloš Kozák <milos.kozak at lejmr.com> wrote:

> Hi, thank you. I checked the source code and found it is very similar to
> the LVM TM/Datastore drivers already shipped with ONE; you only added
> lvchange -ay DEV. Do you run CLVM alongside it or not?
>
> I worry about parallel changes to the LVM metadata, which might corrupt it.
> With sequential behaviour it is probably not an issue; can you confirm that?
> Or is it highly dangerous to run lvm_shared without CLVM?
>
> Thanks, Milos
>
>
> On 30.1.2013 10:09, Marlok Tamás wrote:
>
> Hi,
>
> We have a custom datastore and transfer manager driver, which runs the
> lvchange command when it is needed.
> For it to work, you have to enable it in oned.conf.
>
> for example:
>
> DATASTORE_MAD = [
>     executable = "one_datastore",
>     arguments  = "-t 10 -d fs,vmware,iscsi,lvm,shared_lvm"]
>
> TM_MAD = [
>     executable = "one_tm",
>     arguments  = "-t 10 -d dummy,lvm,shared,qcow2,ssh,vmware,iscsi,shared_lvm" ]
>
> After that, you can create a datastore with the shared_lvm tm and
> datastore driver.
>
> The only limitation is that you can't live migrate VMs. We have a working
> solution for that as well, but it is still untested. I can send you that
> too, if you want to help us test it.
>
> Anyway, here are the drivers, feel free to use or modify them.
> https://dl.dropbox.com/u/140123/shared_lvm.tar.gz
>
> --
> Cheers,
> Marlok Tamas
> MTA Sztaki
>
>
>
> On Thu, Jan 24, 2013 at 11:32 PM, Mihály Héder <mihaly.heder at sztaki.mta.hu> wrote:
>
>> Hi,
>>
>> Well, if you can run the lvs or lvscan on at least one server
>> successfully, then the metadata is probably fine.
>> We had similar issues before we learned how to exclude unnecessary
>> block devices in the lvm config.
>>
>> The thing is that lvscan and lvs will by default try to check _every_
>> potential block device for LVM partitions. If you are lucky, this is
>> only annoying, because it will throw 'can't read /dev/sdX' or similar
>> messages. However, if you are using dm-multipath, you will have one
>> device for each path, like /dev/sdr, _plus_ the aggregated device with
>> the name you have configured in multipath.conf (/dev/mapper/yourname),
>> which is the one you actually need. LVM did not quite understand this
>> situation and got stuck on the individual path devices, so we configured
>> it to look for LVM only in the right place. In the man page of lvm.conf,
>> look for the devices / scan and filter options. There are also quite
>> good examples in the comments there.
>>
>> Also, there could be a much simpler explanation for the issue:
>> something with the iSCSI connection or multipath, which are one layer
>> below.
>>
>> I hope this helps.
>>
>> Cheers
>> Mihály
>>
>> On 24 January 2013 23:18, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>> > Hi, thank you. I tried updating the TM ln script, which works, but it is
>> > not a clean solution. So I will try to write hook code and then we can
>> > discuss it.
>> >
>> > I deployed a few VMs and now the lvs command freezes on the other
>> > server. I have not set up clvm; do you think it could be caused by LVM
>> > metadata corruption? The thing is, I can no longer start a VM on the
>> > other server.
>> >
>> > Miloš
>> >
>> > On 24.1.2013 23:10, Mihály Héder wrote:
>> >
>> >> Hi!
>> >>
>> >> We solve this problem via hooks that activate the LVs for us
>> >> when we start/migrate a VM. Unfortunately I will be out of the office
>> >> until early next week, but then I will consult with my colleague who
>> >> did the actual coding of this part and we will share the code.
>> >>
>> >> Cheers
>> >> Mihály
>> >>
>> >> On 24 January 2013 20:15, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>> >>>
>> >>> Hi, I have just set it up with two hosts sharing a block device, with
>> >>> LVM on top of that, as discussed earlier. Running lvs I can see all
>> >>> logical volumes. When I create a new LV on the other server, I can see
>> >>> that the LV is inactive, so I have to run lvchange -ay VG/LV to enable
>> >>> it; then this LV can be used for a VM.
>> >>>
>> >>> Is there any trick to auto-enable a newly created LV on every host?
>> >>>
>> >>> Thanks Milos
>> >>>
>> >>> On 22.1.2013 18:22, Mihály Héder wrote:
>> >>>
>> >>>> Hi!
>> >>>>
>> >>>> You need to look at locking_type in the lvm.conf manual [1]. The
>> >>>> default - locking in a local directory - is ok for the frontend, and
>> >>>> type 4 is read-only. However, you should not forget that this only
>> >>>> prevents damage done by the lvm commands. If you start to write
>> >>>> zeros to your disk with the dd command, for example, that will kill
>> >>>> your partition regardless of the lvm setting. So this protects mainly
>> >>>> against user or middleware errors, not against malicious attacks.
>> >>>>
>> >>>> Cheers
>> >>>> Mihály Héder
>> >>>> MTA SZTAKI
>> >>>>
>> >>>> [1] http://linux.die.net/man/5/lvm.conf
>> >>>>
>> >>>> On 21 January 2013 18:58, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>> >>>>>
>> >>>>> Oh snap, that sounds great, I didn't know about that. It makes
>> >>>>> everything easier. In this scenario only the frontend can work with
>> >>>>> LVM, so there are no issues with concurrent changes. Only one last
>> >>>>> thing to make it really safe against that: is there any way to
>> >>>>> suppress LVM changes from the hosts, i.e. make it read-only there,
>> >>>>> and leave it read-write on the frontend?
>> >>>>>
>> >>>>> Thanks
>> >>>>>
>> >>>>>
>> >>>>> On 21.1.2013 18:50, Mihály Héder wrote:
>> >>>>>
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> no, you don't have to do any of that. Also, nebula doesn't have to
>> >>>>>> care about LVM metadata at all, and therefore there is no
>> >>>>>> corresponding function in it. In /etc/lvm there is no metadata,
>> >>>>>> only configuration files.
>> >>>>>>
>> >>>>>> LVM metadata simply sits somewhere at the beginning of your
>> >>>>>> iSCSI-shared disk, like a partition table. So it is on the storage
>> >>>>>> that is accessed by all your hosts, and no distribution is
>> >>>>>> necessary. The nebula frontend simply issues lvcreate, lvchange,
>> >>>>>> etc. on this shared disk, and those commands manipulate the
>> >>>>>> metadata.
>> >>>>>>
>> >>>>>> It is really LVM's internal business, many layers below opennebula.
>> >>>>>> All you have to make sure of is that you don't run these commands
>> >>>>>> concurrently from multiple hosts on the same iSCSI-attached disk,
>> >>>>>> because then they could interfere with each other. This setting is
>> >>>>>> what you have to configure in /etc/lvm on the server hosts.
>> >>>>>>
>> >>>>>> Cheers
>> >>>>>> Mihály
>> >>>>>>
>> >>>>>> On 21 January 2013 18:37, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>> >>>>>>>
>> >>>>>>> Thank you. Does it mean that I can distribute the metadata files
>> >>>>>>> located in /etc/lvm on the frontend to the other hosts, and these
>> >>>>>>> hosts will then see my logical volumes? Is there any code in
>> >>>>>>> nebula which would provide that? Or do I need to update the DS
>> >>>>>>> scripts to update/distribute LVM metadata among the servers?
>> >>>>>>>
>> >>>>>>> Thanks, Milos
>> >>>>>>>
>> >>>>>>> On 21.1.2013 18:29, Mihály Héder wrote:
>> >>>>>>>
>> >>>>>>>> Hi,
>> >>>>>>>>
>> >>>>>>>> LVM metadata [1] is simply stored on the disk. In the setup we
>> >>>>>>>> are discussing, this happens to be a shared virtual disk on the
>> >>>>>>>> storage, so any other hosts that attach the same virtual disk
>> >>>>>>>> should see the changes as they happen, provided that they re-read
>> >>>>>>>> the disk. This re-reading step is what you can trigger with
>> >>>>>>>> lvscan, but nowadays that seems to be unnecessary. For us it
>> >>>>>>>> works with CentOS 6.3, so I guess Scientific Linux should be fine
>> >>>>>>>> as well.
>> >>>>>>>>
>> >>>>>>>> Cheers
>> >>>>>>>> Mihály
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> [1] http://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/lvm_metadata.html
>> >>>>>>>>
>> >>>>>>>> On 21 January 2013 12:53, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>> Hi,
>> >>>>>>>>> thank you for the great answer. As I wrote, my objective is to
>> >>>>>>>>> avoid as much clustering software (pacemaker, ...) as possible,
>> >>>>>>>>> and clvm is one of the things I feel bad about in my
>> >>>>>>>>> configuration. Therefore I would rather let nebula manage the
>> >>>>>>>>> LVM metadata in the first place, as you wrote. The one last
>> >>>>>>>>> thing I don't understand is how nebula distributes the LVM
>> >>>>>>>>> metadata.
>> >>>>>>>>>
>> >>>>>>>>> Is the kernel in Scientific Linux 6.3 new enough to avoid the
>> >>>>>>>>> LVM issue you mentioned?
>> >>>>>>>>>
>> >>>>>>>>> Thanks Milos
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On 21.1.2013 12:34, Mihály Héder wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi!
>> >>>>>>>>>>
>> >>>>>>>>>> Last time we could test an EqualLogic, it did not have an
>> >>>>>>>>>> option to create/configure Virtual Disks inside it via an API,
>> >>>>>>>>>> so I think the iSCSI driver is not an alternative, as it would
>> >>>>>>>>>> require a configuration step per virtual machine on the
>> >>>>>>>>>> storage.
>> >>>>>>>>>>
>> >>>>>>>>>> However, you can use your storage just fine in a shared LVM
>> >>>>>>>>>> scenario. You need to consider two different things:
>> >>>>>>>>>> - the LVM metadata, and the actual VM data on the partitions.
>> >>>>>>>>>> It is true that concurrent modification of the metadata should
>> >>>>>>>>>> be avoided, as in theory it can damage the whole volume group.
>> >>>>>>>>>> You could use clvm, which avoids that by clustered locking, and
>> >>>>>>>>>> then every participating machine can safely create/modify/delete
>> >>>>>>>>>> LVs. However, in a nebula setup this is not necessary in every
>> >>>>>>>>>> case: you can make the LVM metadata read-only on your host
>> >>>>>>>>>> servers and let only the frontend modify it. Then it can use
>> >>>>>>>>>> local locking, which does not require clvm.
>> >>>>>>>>>> - of course, the host servers can write the data inside the
>> >>>>>>>>>> partitions even though the metadata is read-only for them. It
>> >>>>>>>>>> should work just fine as long as you don't start two VMs on one
>> >>>>>>>>>> partition.
>> >>>>>>>>>>
>> >>>>>>>>>> We are running this setup with a dual-controller Dell MD3600
>> >>>>>>>>>> storage without issues so far. Before that, we used to do the
>> >>>>>>>>>> same with XEN machines for years on an older EMC (that was
>> >>>>>>>>>> before nebula). Now with nebula we have been using a home-grown
>> >>>>>>>>>> module for doing that, which I can send you any time - we plan
>> >>>>>>>>>> to submit it as a feature enhancement anyway. Also, there seems
>> >>>>>>>>>> to be a similar shared LVM module in the nebula upstream, which
>> >>>>>>>>>> we could not get to work yet, but we did not try much.
>> >>>>>>>>>>
>> >>>>>>>>>> The plus side of this setup is that you can make live migration
>> >>>>>>>>>> work nicely. There are two points to consider, however: once
>> >>>>>>>>>> you set the LVM metadata read-only, you won't be able to modify
>> >>>>>>>>>> the local LVMs in your servers, if there are any. Also, in
>> >>>>>>>>>> older kernels, when you modified the LVM on one machine the
>> >>>>>>>>>> others did not get notified about the changes, so you had to
>> >>>>>>>>>> issue an lvs command. However, in new kernels this issue seems
>> >>>>>>>>>> to be solved, and the LVs get updated instantly. I don't know
>> >>>>>>>>>> when and what exactly changed, though.
>> >>>>>>>>>>
>> >>>>>>>>>> Cheers
>> >>>>>>>>>> Mihály Héder
>> >>>>>>>>>> MTA SZTAKI ITAK
>> >>>>>>>>>>
>> >>>>>>>>>> On 18 January 2013 08:57, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>> Hi, I am setting up a small installation of opennebula with
>> >>>>>>>>>>> shared storage using iSCSI. The storage is an EqualLogic EMC
>> >>>>>>>>>>> with two controllers. For now we have only two host servers,
>> >>>>>>>>>>> so we use a backed direct connection between the storage and
>> >>>>>>>>>>> each server, see attachment. For this purpose we set up
>> >>>>>>>>>>> dm-multipath. In the future we want to add other servers, and
>> >>>>>>>>>>> some other technology will then be necessary in the network
>> >>>>>>>>>>> segment. For now we are trying to keep it as close as possible
>> >>>>>>>>>>> to the future topology from a protocol point of view.
>> >>>>>>>>>>>
>> >>>>>>>>>>> My question is about how to define the datastore: which driver
>> >>>>>>>>>>> and TM are the best?
>> >>>>>>>>>>>
>> >>>>>>>>>>> My primary objective is to avoid GFS2 or any other cluster
>> >>>>>>>>>>> filesystem; I would prefer to keep the datastore as block
>> >>>>>>>>>>> devices. The only option I see is to use LVM, but I worry
>> >>>>>>>>>>> about concurrent writes. Isn't that a problem? I was googling
>> >>>>>>>>>>> a bit and found that I would need to set up clvm - is it
>> >>>>>>>>>>> really necessary?
>> >>>>>>>>>>>
>> >>>>>>>>>>> Or is it better to use the iSCSI driver, drop dm-multipath and
>> >>>>>>>>>>> hope?
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thanks, Milos
>> >>>>>>>>>>>
>> >>>>>>>>>>> _______________________________________________
>> >>>>>>>>>>> Users mailing list
>> >>>>>>>>>>> Users at lists.opennebula.org
>> >>>>>>>>>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>> >>>>>>>>>>>