[one-users] iSCSI multipath

Miloš Kozák milos.kozak at lejmr.com
Wed Jan 30 09:47:11 PST 2013


Hi, that sounds interesting; I think I am going to give it a try. I am 
still unsure whether or not to use CLVM. How long have you been running 
it like that? Have you ever had any serious issues related to LVM?

Thank you, Milos


On 30.1.2013 13:59, Marlok Tamás wrote:
> Hi,
>
> We are running it without CLVM.
> If you examine the ONE/lvm driver (the tm/clone script for example),
> you can see that the lvcreate command runs on the destination host.
> In the shared LVM driver, all the LVM commands run on the frontend,
> hence there is no possibility of parallel changes (assuming that you
> are using only one frontend), because local locking is in effect on
> the frontend.
>
> The other thing with the ONE/lvm driver is that it makes a snapshot in
> the clone script, while our driver makes a new clone LV. I tried to
> use the original LVM driver, and every time I deployed a new VM, I got
> this error message:
> lv-one-50 must be active exclusively to create snapshot
> If you (or anyone else) know how to avoid this error, please let me
> know.
> Besides that, snapshots are much slower for write operations (as far
> as I know).
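> (The error message suggests the LV has to be activated exclusively
> before the snapshot is taken; a rough, untested sketch with
> placeholder VG/LV names would be
>
>     lvchange -aey vg-one/lv-one-50
>     lvcreate -s -L 1G -n lv-one-50-snap vg-one/lv-one-50
>
> but we have not verified that this helps in this setup.)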
>
> Hope this helps!
> --
> Cheers,
> tmarlok
>
>
> On Wed, Jan 30, 2013 at 1:37 PM, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>
>     Hi, thank you. I checked the source code and found it is very
>     similar to the LVM TM/datastore drivers already provided in ONE,
>     except that you added lvchange -ay DEV. Do you run CLVM alongside
>     that or not?
>
>     I worry about parallel changes to the LVM metadata, which might
>     destroy it. Given the sequential behaviour it is probably not an
>     issue, but can you confirm that for me? Or is it highly dangerous
>     to run lvm_shared without CLVM?
>
>     Thanks, Milos
>
>
>     On 30.1.2013 10:09, Marlok Tamás wrote:
>>     Hi,
>>
>>     We have a custom datastore and transfer manager driver which runs
>>     the lvchange command when it is needed.
>>     To make it work, you have to enable it in oned.conf.
>>
>>     For example:
>>
>>     DATASTORE_MAD = [
>>         executable = "one_datastore",
>>         arguments  = "-t 10 -d fs,vmware,iscsi,lvm,shared_lvm"]
>>
>>     TM_MAD = [
>>         executable = "one_tm",
>>         arguments  = "-t 10 -d dummy,lvm,shared,qcow2,ssh,vmware,iscsi,shared_lvm" ]
>>
>>     After that, you can create a datastore with the shared_lvm TM and
>>     datastore drivers.
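>>
>>     For example, a minimal datastore template could look roughly like
>>     this (a sketch only; attribute names other than DS_MAD/TM_MAD,
>>     such as VG_NAME, depend on what the driver scripts actually read):
>>
>>     NAME    = "shared_lvm_ds"
>>     DS_MAD  = "shared_lvm"
>>     TM_MAD  = "shared_lvm"
>>     VG_NAME = "vg-one"     # volume group on the shared iSCSI disk
>>
>>     which you would then register with
>>     onedatastore create shared_lvm_ds.conf.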
>>
>>     The only limitation is that you can't live migrate VMs. We have a
>>     working solution for that as well, but it is still untested. I
>>     can send you that too, if you want to help us test it.
>>
>>     Anyway, here are the drivers; feel free to use or modify them.
>>     https://dl.dropbox.com/u/140123/shared_lvm.tar.gz
>>
>>     --
>>     Cheers,
>>     Marlok Tamas
>>     MTA Sztaki
>>
>>
>>
>>     On Thu, Jan 24, 2013 at 11:32 PM, Mihály Héder <mihaly.heder at sztaki.mta.hu> wrote:
>>
>>         Hi,
>>
>>         Well, if you can run lvs or lvscan successfully on at least
>>         one server, then the metadata is probably fine.
>>         We had similar issues before we learned how to exclude
>>         unnecessary block devices in the LVM config.
>>
>>         The thing is that lvscan and lvs will by default try to check
>>         _every_ potential block device for LVM partitions. If you are
>>         lucky, this is only annoying, because it will throw 'can't
>>         read /dev/sdX' or similar messages. However, if you are using
>>         dm-multipath, you will have one device for each path, like
>>         /dev/sdr, _plus_ the aggregated device with the name you have
>>         configured in multipath.conf (/dev/mapper/yourname), which is
>>         what you actually need. LVM did not quite understand this
>>         situation and got stuck on the individual path devices, so we
>>         configured it to look for LVM only in the right place. In the
>>         lvm.conf man page, look for the devices / scan and filter
>>         options. There are also quite good examples in the comments
>>         there.
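>>
>>         As a rough illustration (the multipath alias below is only a
>>         placeholder, adjust it to your own multipath.conf), the
>>         devices section of lvm.conf can be restricted like this:
>>
>>         devices {
>>             # accept only the aggregated multipath device and
>>             # reject every other block device
>>             filter = [ "a|^/dev/mapper/yourname$|", "r|.*|" ]
>>         }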
>>
>>         Also, there could be a much simpler explanation for the
>>         issue: something with the iSCSI connection or multipath,
>>         which are one layer below.
>>
>>         I hope this helps.
>>
>>         Cheers
>>         Mihály
>>
>>         On 24 January 2013 23:18, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>>         > Hi, thank you. I tried to update the TM ln script, which
>>         > works but is not a clean solution. So I will try to write
>>         > the hook code and then we can discuss it.
>>         >
>>         > I deployed a few VMs and now the lvs command freezes on the
>>         > other server. I have not set up clvm; do you think it could
>>         > be caused by LVM metadata corruption? The thing is I can no
>>         > longer start a VM on the other server.
>>         >
>>         > Miloš
>>         >
>>         > On 24.1.2013 23:10, Mihály Héder wrote:
>>         >
>>         >> Hi!
>>         >>
>>         >> We solve this problem via hooks that activate the LVs for
>>         >> us when we start/migrate a VM. Unfortunately I will be out
>>         >> of the office until early next week, but then I will
>>         >> consult with my colleague who did the actual coding of
>>         >> this part and we will share the code.
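>>         >>
>>         >> (As a rough idea of the shape it takes, not the actual
>>         >> code: such a hook is registered in oned.conf roughly along
>>         >> these lines, with the hook name, state and script being
>>         >> placeholders,
>>         >>
>>         >> VM_HOOK = [
>>         >>     name      = "activate_lv",
>>         >>     on        = "PROLOG",
>>         >>     command   = "activate_lv.sh",
>>         >>     arguments = "$ID",
>>         >>     remote    = "yes" ]
>>         >>
>>         >> and the script then runs lvchange -ay on the VM's LVs on
>>         >> the target host.)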
>>         >>
>>         >> Cheers
>>         >> Mihály
>>         >>
>>         >> On 24 January 2013 20:15, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>>         >>>
>>         >>> Hi, I have just set it up with two hosts sharing a block
>>         >>> device, and LVM on top of that, as discussed earlier.
>>         >>> Running lvs I can see all logical volumes. When I create
>>         >>> a new LV on the other server, I can see that the LV is
>>         >>> inactive, so I have to run lvchange -ay VG/LV to enable
>>         >>> it; then the LV can be used for a VM.
>>         >>>
>>         >>> Is there any trick to automatically enable a newly
>>         >>> created LV on every host?
>>         >>>
>>         >>> Thanks Milos
>>         >>>
>>         >>> On 22.1.2013 18:22, Mihály Héder wrote:
>>         >>>
>>         >>>> Hi!
>>         >>>>
>>         >>>> You need to look at locking_type in the lvm.conf manual
>>         >>>> [1]. The default - locking in a local directory - is ok
>>         >>>> for the frontend, and type 4 is read-only. However, you
>>         >>>> should not forget that this only prevents damage done by
>>         >>>> the lvm commands. If you start to write zeros to your
>>         >>>> disk with the dd command, for example, that will kill
>>         >>>> your partition regardless of the lvm setting. So this
>>         >>>> protects mainly against user or middleware errors, not
>>         >>>> against malicious attacks.
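>>         >>>>
>>         >>>> (As a minimal sketch, that would mean something like the
>>         >>>> following in /etc/lvm/lvm.conf on the host servers:
>>         >>>>
>>         >>>> global {
>>         >>>>     # type 4 is read-only: LVM tools may read but not
>>         >>>>     # change the metadata on these hosts
>>         >>>>     locking_type = 4
>>         >>>> }
>>         >>>>
>>         >>>> while the frontend keeps the default locking_type = 1,
>>         >>>> i.e. local file-based locking.)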
>>         >>>>
>>         >>>> Cheers
>>         >>>> Mihály Héder
>>         >>>> MTA SZTAKI
>>         >>>>
>>         >>>> [1] http://linux.die.net/man/5/lvm.conf
>>         >>>>
>>         >>>> On 21 January 2013 18:58, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>>         >>>>>
>>         >>>>> Oh snap, that sounds great, I didn't know about that...
>>         >>>>> it makes everything easier.
>>         >>>>> In this scenario only the frontend can work with LVM,
>>         >>>>> so there are no issues with concurrent changes. Just
>>         >>>>> one last thing to make it really safe against that: is
>>         >>>>> there any way to suppress LVM changes from the hosts,
>>         >>>>> i.e. make it read-only there and keep it read-write on
>>         >>>>> the frontend?
>>         >>>>>
>>         >>>>> Thanks
>>         >>>>>
>>         >>>>>
>>         >>>>> On 21.1.2013 18:50, Mihály Héder wrote:
>>         >>>>>
>>         >>>>>> Hi,
>>         >>>>>>
>>         >>>>>> no, you don't have to do any of that. Also, nebula
>>         >>>>>> doesn't have to care about LVM metadata at all, and
>>         >>>>>> therefore there is no corresponding function in it. In
>>         >>>>>> /etc/lvm there is no metadata, only configuration
>>         >>>>>> files.
>>         >>>>>>
>>         >>>>>> LVM metadata simply sits somewhere at the beginning of
>>         >>>>>> your iSCSI-shared disk, like a partition table. So it
>>         >>>>>> is on the storage that is accessed by all your hosts,
>>         >>>>>> and no distribution is necessary. The nebula frontend
>>         >>>>>> simply issues lvcreate, lvchange, etc. on this shared
>>         >>>>>> disk, and those commands manipulate the metadata.
>>         >>>>>>
>>         >>>>>> It is really LVM's internal business, many layers
>>         >>>>>> below opennebula. All you have to make sure of is that
>>         >>>>>> you don't run these commands concurrently from
>>         >>>>>> multiple hosts on the same iSCSI-attached disk,
>>         >>>>>> because then they could interfere with each other.
>>         >>>>>> This is the setting you have to indicate in /etc/lvm
>>         >>>>>> on the server hosts.
>>         >>>>>>
>>         >>>>>> Cheers
>>         >>>>>> Mihály
>>         >>>>>>
>>         >>>>>> On 21 January 2013 18:37, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>>         >>>>>>>
>>         >>>>>>> Thank you. Does that mean I can distribute the
>>         >>>>>>> metadata files located in /etc/lvm on the frontend to
>>         >>>>>>> the other hosts, and these hosts will then see my
>>         >>>>>>> logical volumes? Is there any code in nebula which
>>         >>>>>>> would do this, or do I need to update the DS scripts
>>         >>>>>>> to update/distribute the LVM metadata among the
>>         >>>>>>> servers?
>>         >>>>>>>
>>         >>>>>>> Thanks, Milos
>>         >>>>>>>
>>         >>>>>>> On 21.1.2013 18:29, Mihály Héder wrote:
>>         >>>>>>>
>>         >>>>>>>> Hi,
>>         >>>>>>>>
>>         >>>>>>>> LVM metadata [1] is simply stored on the disk. In
>>         >>>>>>>> the setup we are discussing this happens to be a
>>         >>>>>>>> shared virtual disk on the storage, so any other
>>         >>>>>>>> hosts that attach the same virtual disk should see
>>         >>>>>>>> the changes as they happen, provided that they
>>         >>>>>>>> re-read the disk. This re-reading step is what you
>>         >>>>>>>> can trigger with lvscan, but nowadays that seems to
>>         >>>>>>>> be unnecessary. For us it works with CentOS 6.3, so
>>         >>>>>>>> I guess Scientific Linux should be fine as well.
>>         >>>>>>>>
>>         >>>>>>>> Cheers
>>         >>>>>>>> Mihály
>>         >>>>>>>>
>>         >>>>>>>>
>>         >>>>>>>> [1] http://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/lvm_metadata.html
>>         >>>>>>>>
>>         >>>>>>>> On 21 January 2013 12:53, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>>         >>>>>>>>>
>>         >>>>>>>>> Hi,
>>         >>>>>>>>> thank you for the great answer. As I wrote, my
>>         >>>>>>>>> objective is to avoid as much clustering software
>>         >>>>>>>>> (pacemaker, ...) as possible, so clvm is one of
>>         >>>>>>>>> those things I feel uneasy about in my
>>         >>>>>>>>> configuration. Therefore I would rather let nebula
>>         >>>>>>>>> manage the LVM metadata in the first place, as you
>>         >>>>>>>>> wrote. The only thing I still don't understand is
>>         >>>>>>>>> how nebula distributes the LVM metadata.
>>         >>>>>>>>>
>>         >>>>>>>>> Is the kernel in Scientific Linux 6.3 new enough to
>>         >>>>>>>>> avoid the LVM issue you mentioned?
>>         >>>>>>>>>
>>         >>>>>>>>> Thanks Milos
>>         >>>>>>>>>
>>         >>>>>>>>>
>>         >>>>>>>>>
>>         >>>>>>>>>
>>         >>>>>>>>> On 21.1.2013 12:34, Mihály Héder wrote:
>>         >>>>>>>>>
>>         >>>>>>>>>> Hi!
>>         >>>>>>>>>>
>>         >>>>>>>>>> The last time we could test an EqualLogic it did
>>         >>>>>>>>>> not have an option to create/configure virtual
>>         >>>>>>>>>> disks inside it through an API, so I think the
>>         >>>>>>>>>> iSCSI driver is not an alternative, as it would
>>         >>>>>>>>>> require a configuration step per virtual machine
>>         >>>>>>>>>> on the storage.
>>         >>>>>>>>>>
>>         >>>>>>>>>> However, you can use your storage just fine in a
>>         >>>>>>>>>> shared LVM scenario.
>>         >>>>>>>>>> You need to consider two different things:
>>         >>>>>>>>>> - the LVM metadata, and the actual VM data on the
>>         >>>>>>>>>> partitions. It is true that concurrent
>>         >>>>>>>>>> modification of the metadata should be avoided,
>>         >>>>>>>>>> as in theory it can damage the whole volume
>>         >>>>>>>>>> group. You could use clvm, which avoids that by
>>         >>>>>>>>>> clustered locking, and then every participating
>>         >>>>>>>>>> machine can safely create/modify/delete LVs.
>>         >>>>>>>>>> However, in a nebula setup this is not necessary
>>         >>>>>>>>>> in every case: you can make the LVM metadata
>>         >>>>>>>>>> read-only on your host servers and let only the
>>         >>>>>>>>>> frontend modify it. Then it can use local locking,
>>         >>>>>>>>>> which does not require clvm.
>>         >>>>>>>>>> - of course the host servers can write the data
>>         >>>>>>>>>> inside the partitions regardless of the metadata
>>         >>>>>>>>>> being read-only for them. It should work just fine
>>         >>>>>>>>>> as long as you don't start two VMs on one
>>         >>>>>>>>>> partition.
>>         >>>>>>>>>>
>>         >>>>>>>>>> We are running this setup with a dual controller
>>         >>>>>>>>>> Dell MD3600 storage without issues so far. Before
>>         >>>>>>>>>> that, we used to do the same with XEN machines for
>>         >>>>>>>>>> years on an older EMC (that was before nebula).
>>         >>>>>>>>>> Now with nebula we have been using a home-grown
>>         >>>>>>>>>> module for doing that, which I can send you any
>>         >>>>>>>>>> time - we plan to submit that as a feature
>>         >>>>>>>>>> enhancement anyway. Also, there seems to be a
>>         >>>>>>>>>> similar shared LVM module in the nebula upstream
>>         >>>>>>>>>> which we could not get to work yet, but did not
>>         >>>>>>>>>> try much.
>>         >>>>>>>>>>
>>         >>>>>>>>>> The plus side of this setup is that you can make
>>         >>>>>>>>>> live migration work nicely. There are two points
>>         >>>>>>>>>> to consider, however: once you set the LVM
>>         >>>>>>>>>> metadata read-only, you won't be able to modify
>>         >>>>>>>>>> any local LVM volumes on your servers, if there
>>         >>>>>>>>>> are any. Also, in older kernels, when you modified
>>         >>>>>>>>>> the LVM on one machine the others did not get
>>         >>>>>>>>>> notified about the changes, so you had to issue an
>>         >>>>>>>>>> lvs command. However, in newer kernels this issue
>>         >>>>>>>>>> seems to be solved; the LVs get updated instantly.
>>         >>>>>>>>>> I don't know when and what exactly changed though.
>>         >>>>>>>>>>
>>         >>>>>>>>>> Cheers
>>         >>>>>>>>>> Mihály Héder
>>         >>>>>>>>>> MTA SZTAKI ITAK
>>         >>>>>>>>>>
>>         >>>>>>>>>> On 18 January 2013 08:57, Miloš Kozák <milos.kozak at lejmr.com> wrote:
>>         >>>>>>>>>>>
>>         >>>>>>>>>>> Hi, I am setting up a small installation of
>>         >>>>>>>>>>> opennebula with shared storage using iSCSI. The
>>         >>>>>>>>>>> storage is an Equilogic EMC with two controllers.
>>         >>>>>>>>>>> At the moment we have only two host servers, so
>>         >>>>>>>>>>> we use direct connections between the storage and
>>         >>>>>>>>>>> each server (see attachment). For this purpose we
>>         >>>>>>>>>>> set up dm-multipath. In the future we want to add
>>         >>>>>>>>>>> other servers, and some other technology will be
>>         >>>>>>>>>>> necessary in the network segment, so these days
>>         >>>>>>>>>>> we try to keep the setup as close as possible to
>>         >>>>>>>>>>> the future topology from the protocol point of
>>         >>>>>>>>>>> view.
>>         >>>>>>>>>>>
>>         >>>>>>>>>>> My question concerns how to define the datastore:
>>         >>>>>>>>>>> which datastore driver and TM are best?
>>         >>>>>>>>>>>
>>         >>>>>>>>>>> My primary objective is to avoid GFS2 or any
>>         >>>>>>>>>>> other cluster filesystem; I would prefer to keep
>>         >>>>>>>>>>> the datastore as block devices. The only option I
>>         >>>>>>>>>>> see is to use LVM, but I worry about concurrent
>>         >>>>>>>>>>> writes - isn't that a problem? I was googling a
>>         >>>>>>>>>>> bit and found I would need to set up clvm - is it
>>         >>>>>>>>>>> really necessary?
>>         >>>>>>>>>>>
>>         >>>>>>>>>>> Or is it better to use the iSCSI driver, drop
>>         >>>>>>>>>>> dm-multipath and hope?
>>         >>>>>>>>>>>
>>         >>>>>>>>>>> Thanks, Milos
>>         >>>>>>>>>>>
>>         >>>>>
>>         >>>>>
>>         >
>>
>>
>
>
>     _______________________________________________
>     Users mailing list
>     Users at lists.opennebula.org
>     http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>
>


