[one-users] TM question about rescheduling a vm
Gary S. Cuozzo
gary at isgsoftware.net
Wed Dec 12 16:54:25 PST 2012
1. I swear I checked sched.conf, but LIVE_RESCHEDS was 0. Sorry about that. I tested again and it seemed to work fine, other than a bug I think I found in my TM scripts. The VM I tried to migrate has 3 disks, so the pre/post scripts got called 3 times. Since my scripts iterate through all disks on the first call, I have to add a bit of logic so they don't fail on the later calls just because the filesystem has already been mounted or unmounted.
2. Yep. I have modified the system datastore pre/post scripts to call into my custom TMs. The TMs then go through the template, check for disks they manage, and handle setup/teardown appropriately. It works very well.
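For the repeated-call problem in point 1, a minimal sketch of the kind of guard I mean, assuming the scripts can tell from the mount table whether a path is already mounted. The function names and the MOUNTS_FILE override are hypothetical, not part of OpenNebula or of my actual scripts:

```shell
# Sketch only: is_mounted, ensure_mounted, ensure_unmounted and MOUNTS_FILE
# are hypothetical names used for illustration.

# File listing the current mounts; /proc/mounts on Linux.
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"

# Return 0 if directory $1 appears as a mount point in the mount table.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "$MOUNTS_FILE"
}

# Mount NFS export $1 on directory $2 only if it is not mounted yet,
# so the second and third calls for the same VM are no-ops.
ensure_mounted() {
    if is_mounted "$2"; then
        echo "already mounted: $2"
        return 0
    fi
    mkdir -p "$2" && mount -t nfs "$1" "$2"
}

# Unmount directory $1 only if it is still mounted, for the same reason.
ensure_unmounted() {
    if ! is_mounted "$1"; then
        echo "already unmounted: $1"
        return 0
    fi
    umount "$1"
}
```

With guards like these, the first call does the real work for all disks and the later calls return success instead of failing. The comparison is on the exact path string, so the mount point has to be written the same way it appears in the mount table.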
Thanks,
gary
----- Original Message -----
From: "Ruben S. Montero" <rsmontero at opennebula.org>
To: "Gary S. Cuozzo" <gary at isgsoftware.net>
Cc: "Users OpenNebula" <users at lists.opennebula.org>
Sent: Wednesday, December 12, 2012 4:44:46 PM
Subject: Re: [one-users] TM question about rescheduling a vm
Hi,
There are two things:
1.- Check that LIVE_RESCHEDS in sched.conf is set to 1, as pointed out by Carlos.
2.- The pre/post migrate scripts are those of the system datastore,
you have to copy the ZFS scripts to the system datastore directory.
This may seem odd, but note that you are operating on objects in the
system datastore, which may contain disks from different image
datastores, so in general the pre/post scripts need to deal with those
situations...
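
For point 1, a quick way to check the setting is sketched below. The helper name live_rescheds is hypothetical, and the file path in the usage line is an assumption that depends on how OpenNebula was installed:

```shell
# Sketch only: live_rescheds is a hypothetical helper, not an OpenNebula command.

# Print the value of LIVE_RESCHEDS from the scheduler configuration file $1.
# Commented-out lines are ignored; if the attribute is set more than once,
# the last occurrence is printed.
live_rescheds() {
    sed -n 's/^[[:space:]]*LIVE_RESCHEDS[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Usage (the path is an assumption, adjust it to your installation):
#   live_rescheds /etc/one/sched.conf
```

A value of 1 means VMs flagged with "onevm resched" are live-migrated; a value of 0 means they go through the save/restore migration.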
Cheers
Ruben
On Wed, Dec 12, 2012 at 4:51 PM, Gary S. Cuozzo <gary at isgsoftware.net> wrote:
> Odd, that's not the behavior I'm seeing. When I issue a "onevm resched
> <id>" I am seeing the VM get suspended, a checkpoint file created, then the
> scheduler attempts to migrate the VM. The pre/postmigrate scripts never get
> called for my TM driver, my NFS mount points don't get created on the target
> host, and the migration fails. I then have to delete the VM and reschedule
> from scratch.
>
> I just verified this using a VM with a single, persistent, disk in my NFS
> datastore. The flow I see in the logs is:
> 1. VM gets flagged for reschedule
> 2. VM gets saved: "Successfully execute virtualization driver operation:
> save."
> 3. VM gets cleaned: "Successfully execute network driver operation:
> clean."
> 4. ONE looks like it stages the VM on the target host: "Successfully
> execute network driver operation: pre."
> 5. VM fails to restore: "Command execution fail:
> /var/tmp/one/vmm/kvm/restore /var/lib/one//datastores/0/72/checkpoint"
> "Command "virsh --connect qemu:///system restore
> /var/lib/one//datastores/0/72/checkpoint" failed: error: Failed to restore
> domain from /var/lib/one//datastores/0/72/checkpoint"
> "error: Unable to allow access for disk path
> /var/lib/one//datastores/0/72/disk.0: No such file or directory"
>
> At this point, the VM is in a failed state and I have to resubmit it.
>
> I am able to live migrate this VM just fine and expected that rescheduling
> it should also have done a live migration. For some reason, it is doing a
> plain migration.
>
> Either way, is there a reason the TM pre/post migrate script is not getting
> called as it would for a live migration? It seems like I either have
> something misconfigured, or there is a bug. Either way I would expect the
> pre/postmigrate scripts to be called.
>
>
> Thanks for any help,
> gary
>
> ________________________________
> From: "Carlos Martín Sánchez" <cmartin at opennebula.org>
> To: "Gary S. Cuozzo" <gary at isgsoftware.net>
> Cc: users at lists.opennebula.org
> Sent: Wednesday, December 12, 2012 10:07:12 AM
> Subject: Re: [one-users] TM question about rescheduling a vm
>
>
> Hi,
>
> There is no 'resched' driver action, the command just marks the VM to be
> rescheduled. The scheduler then migrates or live-migrates these VMs
> depending on the LIVE_RESCHEDS attribute of sched.conf [1].
>
> Regards
>
> [1] http://opennebula.org/documentation:rel3.8:schg
> --
> Carlos Martín, MSc
> Project Engineer
> OpenNebula - The Open-source Solution for Data Center Virtualization
> www.OpenNebula.org | cmartin at opennebula.org | @OpenNebula
>
>
>
> On Tue, Dec 11, 2012 at 8:32 PM, Gary S. Cuozzo <gary at isgsoftware.net>
> wrote:
>>
>> Hello,
>> I have developed my own TM driver for a NFS based datastore using ZFS. It
>> works well so far, except in the case of using the "onevm resched <id>"
>> command.
>>
>> The failure happens because my TM relies on pre/postmigrate scripts to
>> mount/umount ZFS datastores on the hosts as vm's move around. In the case
>> of the resched command, the pre/postmigrate scripts don't get called, so the
>> vm cannot start on the target host because the disk images are not
>> available.
>>
>> In my use case, there isn't much difference between the live migration and
>> the saveas/checkpoint way the resched works. Can the pre/postmigrate
>> scripts be called for the resched? If not, is there some other way I can
>> get them called so I can setup/teardown my nfs mounts?
>>
>> Thanks for any help,
>> gary
>>
>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.opennebula.org
>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>
>
--
Ruben S. Montero, PhD
Project co-Lead and Chief Architect
OpenNebula - The Open Source Solution for Data Center Virtualization
www.OpenNebula.org | rsmontero at opennebula.org | @OpenNebula