[one-users] Possible race condition in iSCSI datastore.

Ruben S. Montero rsmontero at opennebula.org
Fri Dec 7 14:50:13 PST 2012


Nice!

Ok, I'll mark the issue as related to the iSCSI drivers. The new iSCSI
drivers will probably be merged upstream next week...

Cheers

Ruben

On Thu, Dec 6, 2012 at 9:24 PM, Alain Pannetrat
<apannetrat at cloudsecurityalliance.org> wrote:
> Dear Mark, Ruben,
>
> From reading the code in those patches, I think they do indeed greatly
> improve the iSCSI driver and solve my problem.
>
> Tonight I also ran into the issue that, in some cases, more than two
> seconds elapse between iscsiadm_login "$NEW_IQN" "$TARGET_HOST" and the
> appearance of /dev/disk/by-path/*$NEW_IQN-lun-1 in the DISCOVERY_CMD
> inside iscsi/clone, so the "sleep 2" inserted there is not enough. This
> is also fixed by the proposed patches.
> Nice work!
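[Editor's note: a fixed sleep can be replaced by polling for the device path with a timeout. The sketch below is illustrative only, not code from the proposed patches; the function name and the 15-second default are assumptions.]

```shell
#!/bin/sh
# Poll for a device path matching a glob pattern instead of sleeping a
# fixed amount of time. Returns the first matching path, or fails after
# the timeout (in seconds) expires.
wait_for_path() {
    pattern="$1"
    timeout="${2:-15}"
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        for dev in $pattern; do          # intentional glob expansion
            if [ -e "$dev" ]; then
                echo "$dev"
                return 0
            fi
        done
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1                             # device never appeared
}
```

In the clone script this could stand in for the sleep, e.g. `DEV=$(wait_for_path "/dev/disk/by-path/*$NEW_IQN-lun-1") || exit 1`.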
>
> All the best,
>
> Alain
>
>
> On Thu, Dec 6, 2012 at 9:09 PM, Mark Gergely <gergely.mark at sztaki.mta.hu> wrote:
>> Dear Ruben, Alain,
>>
>> Our improved iSCSI driver set, which we proposed earlier, should solve this issue. As mentioned in the ticket, it makes it possible to simultaneously start hundreds of non-persistent virtual machines.
>> The TM concurrency level is 15.
>> You can check the details at: http://dev.opennebula.org/issues/1592
>>
>> All the best,
>> Mark Gergely
>> MTA-SZTAKI LPDS
>>
>> On 2012.12.06., at 20:01, "Ruben S. Montero" <rsmontero at opennebula.org> wrote:
>>
>>> Hi Alain,
>>>
>>> You are totally right, this can be a problem when instantiating
>>> multiple VMs at the same time.  I've filed an issue to look for the
>>> best way to generate the TID [1].
>>>
>>> We'd be interested in updating the tgtadm_next_tid function in
>>> scripts_common.sh. Also, if the tgt server is getting overloaded by
>>> these simultaneous deployments, there are several ways to limit the
>>> concurrency of the TM (e.g. the -t option in oned.conf)
>>>
>>> THANKS for the feedback!
>>>
>>> Ruben
>>>
>>> [1] http://dev.opennebula.org/issues/1682
>>>
>>> On Thu, Dec 6, 2012 at 1:52 PM, Alain Pannetrat
>>> <apannetrat at cloudsecurityalliance.org> wrote:
>>>> Hi all,
>>>>
>>>> I'm new to OpenNebula and this mailing list, so forgive me if I
>>>> stumble over a topic that may have already been discussed.
>>>>
>>>> I'm currently discovering opennebula 3.8.1 with a simple 3 node
>>>> system: a control node, a compute node and a datastore node
>>>> (iscsi+lvm).
>>>>
>>>> I have been testing the bulk instantiation of virtual machines in
>>>> sunstone, where I initiate the bulk creation of 8 virtual machines in
>>>> parallel. I have noticed that between 2 and 4 machines just fail to
>>>> instantiate correctly with the typical following error message:
>>>>
>>>> 08 2012 [TM][I]: Command execution fail:
>>>> /var/lib/one/remotes/tm/iscsi/clone
>>>> iqn.2012-02.org.opennebula:san.vg-one.lv-one-26
>>>> compute.admin.lan:/var/lib/one//datastores/0/111/disk.0 111 101
>>>> Thu Dec  6 14:40:08 2012 [TM][E]: clone: Command "    set -e
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: set -x
>>>> Thu Dec  6 14:40:08 2012 [TM][I]:
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: # get size
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: SIZE=$(sudo lvs --noheadings -o
>>>> lv_size "/dev/vg-one/lv-one-26")
>>>> Thu Dec  6 14:40:08 2012 [TM][I]:
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: # create lv
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: sudo lvcreate -L${SIZE} vg-one -n
>>>> lv-one-26-111
>>>> Thu Dec  6 14:40:08 2012 [TM][I]:
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: # clone lv with dd
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: sudo dd if=/dev/vg-one/lv-one-26
>>>> of=/dev/vg-one/lv-one-26-111 bs=64k
>>>> Thu Dec  6 14:40:08 2012 [TM][I]:
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: # new iscsi target
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: TID=$(sudo tgtadm --lld iscsi --op
>>>> show --mode target |             grep "Target" | tail -n 1 |
>>>>  awk '{split($2,tmp,":"); print tmp[1]+1;}')
>>>> Thu Dec  6 14:40:08 2012 [TM][I]:
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: sudo tgtadm --lld iscsi --op new
>>>> --mode target --tid $TID  --targetname
>>>> iqn.2012-02.org.opennebula:san.vg-one.lv-one-26-111
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: sudo tgtadm --lld iscsi --op bind
>>>> --mode target --tid $TID -I ALL
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: sudo tgtadm --lld iscsi --op new
>>>> --mode logicalunit --tid $TID  --lun 1 --backing-store
>>>> /dev/vg-one/lv-one-26-111
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: sudo tgt-admin --dump |sudo tee
>>>> /etc/tgt/targets.conf > /dev/null 2>&1" failed: + sudo lvs
>>>> --noheadings -o lv_size /dev/vg-one/lv-one-26
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: 131072+0 records in
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: 131072+0 records out
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: 8589934592 bytes (8.6 GB) copied,
>>>> 898.903 s, 9.6 MB/s
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: tgtadm: this target already exists
>>>> Thu Dec  6 14:40:08 2012 [TM][E]: Error cloning
>>>> compute.admin.lan:/dev/vg-one/lv-one-26-111
>>>> Thu Dec  6 14:40:08 2012 [TM][I]: ExitCode: 22
>>>> Thu Dec  6 14:40:08 2012 [TM][E]: Error executing image transfer
>>>> script: Error cloning compute.admin.lan:/dev/vg-one/lv-one-26-111
>>>> Thu Dec  6 14:40:09 2012 [DiM][I]: New VM state is FAILED
>>>>
>>>> After adding traces to the code, I found what seems to be a race
>>>> condition in /var/lib/one/remotes/tm/iscsi/clone, where the following
>>>> commands get executed:
>>>>
>>>> TID=\$($SUDO $(tgtadm_next_tid))
>>>> $SUDO $(tgtadm_target_new "\$TID" "$NEW_IQN")
>>>>
>>>> These commands are typically expanded to something like this:
>>>>
>>>> TID=$(sudo tgtadm --lld iscsi --op show --mode target | grep "Target"
>>>> | tail -n 1 | awk '{split($2,tmp,":"); print tmp[1]+1;}')
>>>> sudo tgtadm --lld iscsi --op new --mode target --tid $TID
>>>> --targetname iqn.2012-02.org.opennebula:san.vg-one.lv-one-26-111
>>>>
>>>> What seems to happen is that two (or more) calls to the first
>>>> command, tgtadm_next_tid, run simultaneously before the second
>>>> command gets a chance to execute, so TID ends up with the same value
>>>> for two (or more) VMs.
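[Editor's note: another way to close this race is to hold a lock across the read-then-create sequence, e.g. with flock(1). The sketch below is illustrative only: a counter file stands in for the "tgtadm --op show" query, and the real fix would hold the lock across both the show and the "tgtadm --op new" call.]

```shell
#!/bin/sh
# Serialize the read-increment-use sequence with flock(1) so concurrent
# clone scripts can never observe the same "next TID".
LOCKFILE=$(mktemp)
STATE=$(mktemp)
OUT=$(mktemp)
echo 0 > "$STATE"

next_tid() {
    (
        flock -x 9                       # exclusive lock; peers block here
        tid=$(( $(cat "$STATE") + 1 ))   # read current max (the "show" step)
        echo "$tid" > "$STATE"           # publish it (the "new" step)
        echo "$tid"
    ) 9>>"$LOCKFILE"
}

# 20 concurrent allocations; with the lock, every TID comes out distinct
for i in $(seq 1 20); do next_tid & done > "$OUT"
wait
sort -n "$OUT" | uniq | wc -l    # 20 distinct TIDs
```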
>>>>
>>>> The workaround I found is to replace the line:
>>>> TID=\$($SUDO $(tgtadm_next_tid))
>>>> with
>>>> TID=$VMID
>>>> in /var/lib/one/remotes/tm/iscsi/clone
>>>>
>>>> Since $VMID is globally unique no race conditions can happen here.
>>>> I've tested this and the failures don't happen anymore in my setting.
>>>> Of course I'm not sure this is the ideal fix, since VMID may take
>>>> values that are out of range for tgtadm. So further testing would
>>>> be needed.
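[Editor's note: the workaround can be hardened with a minimal sanity check. The sketch below is illustrative; the helper name is made up, and it assumes tgtadm target IDs must be positive integers, hence the offset of 1 so that VM 0 still yields a valid TID.]

```shell
#!/bin/sh
# Derive the target ID from the globally unique VM ID instead of reading
# the current maximum from tgtadm, eliminating the race entirely.
vmid_to_tid() {
    vmid="$1"
    # reject empty or non-numeric input
    case "$vmid" in
        ''|*[!0-9]*)
            echo "vmid_to_tid: invalid VMID '$vmid'" >&2
            return 1 ;;
    esac
    echo $(( vmid + 1 ))
}
```

In iscsi/clone the assignment would then read `TID=$(vmid_to_tid "$VMID")` in place of the tgtadm query.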
>>>>
>>>> I'd be happy to get your thoughts/feedback on this issue.
>>>>
>>>> Best,
>>>>
>>>> Alain
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at lists.opennebula.org
>>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>>
>>>
>>>
>>> --
>>> Ruben S. Montero, PhD
>>> Project co-Lead and Chief Architect
>>> OpenNebula - The Open Source Solution for Data Center Virtualization
>>> www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula
>>



-- 
Ruben S. Montero, PhD
Project co-Lead and Chief Architect
OpenNebula - The Open Source Solution for Data Center Virtualization
www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula


