[one-users] Migration issue(s) from 3.7 to 4.2
Carlos Martín Sánchez
cmartin at opennebula.org
Tue Aug 27 09:19:03 PDT 2013
Hi,
Something did not work in the migration process...
The HOST/VMS element you mention should have been added by this file [1],
and your VM xml is missing the USER_TEMPLATE element, which is added here
[2].
Can you compare the contents of your migrator files
in /usr/lib/one/ruby/onedb/ to the repo [3]?
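If you want to see which rows are affected before fixing anything, you can
check each body for the expected element. A quick read-only sketch (ruby,
assuming the rexml and sequel gems used by onedb are available; run it
against a copy of your one.db):

require 'rexml/document'
require 'sequel'

db = Sequel.sqlite('/var/lib/one/one.db')

# Each pool body should contain the element the migrators add.
{ :host_pool => 'VMS', :vm_pool => 'USER_TEMPLATE' }.each do |table, elem|
  db.fetch("SELECT oid, body FROM #{table}") do |row|
    doc = REXML::Document.new(row[:body])
    puts "#{table} #{row[:oid]}: missing #{elem}" if doc.root.elements[elem].nil?
  end
end

That should tell you whether only some rows were skipped or the whole
migration step did not run.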
Regards
[1]
http://dev.opennebula.org/projects/opennebula/repository/revisions/master/entry/src/onedb/3.8.0_to_3.8.1.rb#L92
[2]
http://dev.opennebula.org/projects/opennebula/repository/revisions/master/entry/src/onedb/3.8.4_to_3.9.80.rb#L410
[3]
http://dev.opennebula.org/projects/opennebula/repository/revisions/master/show/src/onedb
--
Join us at OpenNebulaConf2013 <http://opennebulaconf.com> in Berlin, 24-26
September, 2013
--
Carlos Martín, MSc
Project Engineer
OpenNebula - The Open-source Solution for Data Center Virtualization
www.OpenNebula.org | cmartin at opennebula.org |
@OpenNebula <http://twitter.com/opennebula>
On Mon, Aug 26, 2013 at 2:53 PM, Federico Zani
<federico.zani at roma2.infn.it> wrote:
> Hi Carlos,
> the problem is that I can't even get the xml of the vms.
> It seems to be related to how the xml in the "body" column (for
> both hosts and vms) of the database is structured.
>
> Digging into the migration scripts, I solved the host problem by
> adding the <vms> node (even without children) under the <host> tag of the body
> column in the "host_pool" table, but for the vms I still have to find a
> solution.
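>
> In case it helps someone else, the manual fix amounted to something like
> this (a rough sketch of what I ran, with oned stopped and a backup of
> one.db in place):
>
> require 'rexml/document'
> require 'sequel'
>
> db = Sequel.sqlite('/var/lib/one/one.db')
>
> fixes = {}
>
> db.fetch('SELECT oid, body FROM host_pool') do |row|
>   doc = REXML::Document.new(row[:body])
>   next unless doc.root.elements['VMS'].nil?
>   doc.root.add_element('VMS')   # an empty <VMS/> node is enough
>   fixes[row[:oid]] = doc.to_s
> end
>
> fixes.each do |oid, body|
>   db[:host_pool].where(:oid => oid).update(:body => body)
> end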
>
> Now that host access works I'm able to submit and control new vm instances,
> but I have dozens of running vms that I can't destroy (not even
> with the force switch turned on).
>
> This is the xml of one of my hosts, as returned by onehost show -x (relevant
> names are redacted with the "[...]" string):
>
> <HOST>
> <ID>15</ID>
> <NAME>[...]</NAME>
> <STATE>2</STATE>
> <IM_MAD>im_kvm</IM_MAD>
> <VM_MAD>vmm_kvm</VM_MAD>
> <VN_MAD>dummy</VN_MAD>
> <LAST_MON_TIME>1377520947</LAST_MON_TIME>
> <CLUSTER_ID>101</CLUSTER_ID>
> <CLUSTER>[...]</CLUSTER>
> <HOST_SHARE>
> <DISK_USAGE>0</DISK_USAGE>
> <MEM_USAGE>20971520</MEM_USAGE>
> <CPU_USAGE>1800</CPU_USAGE>
> <MAX_DISK>0</MAX_DISK>
> <MAX_MEM>24596936</MAX_MEM>
> <MAX_CPU>2400</MAX_CPU>
> <FREE_DISK>0</FREE_DISK>
> <FREE_MEM>5558100</FREE_MEM>
> <FREE_CPU>2323</FREE_CPU>
> <USED_DISK>0</USED_DISK>
> <USED_MEM>19038836</USED_MEM>
> <USED_CPU>76</USED_CPU>
> <RUNNING_VMS>6</RUNNING_VMS>
> </HOST_SHARE>
> <VMS>
> <ID>326</ID>
> </VMS>
> <TEMPLATE>
> <ARCH><![CDATA[x86_64]]></ARCH>
> <CPUSPEED><![CDATA[1600]]></CPUSPEED>
> <FREECPU><![CDATA[2323.2]]></FREECPU>
> <FREEMEMORY><![CDATA[5558100]]></FREEMEMORY>
> <HOSTNAME><![CDATA[[...]]]></HOSTNAME>
> <HYPERVISOR><![CDATA[kvm]]></HYPERVISOR>
> <MODELNAME><![CDATA[Intel(R) Xeon(R) CPU E5645 @
> 2.40GHz]]></MODELNAME>
> <NETRX><![CDATA[16007208117863]]></NETRX>
> <NETTX><![CDATA[1185926401588]]></NETTX>
> <TOTALCPU><![CDATA[2400]]></TOTALCPU>
> <TOTALMEMORY><![CDATA[24596936]]></TOTALMEMORY>
> <TOTAL_ZOMBIES><![CDATA[5]]></TOTAL_ZOMBIES>
> <USEDCPU><![CDATA[76.8000000000002]]></USEDCPU>
> <USEDMEMORY><![CDATA[19038836]]></USEDMEMORY>
> <ZOMBIES><![CDATA[one-324, one-283, one-314, one-317,
> one-304]]></ZOMBIES>
> </TEMPLATE>
> </HOST>
>
> As you can see, every host now reports its connected vms as "zombies",
> probably because it can't query them.
>
> I'm also sending you the xml contained in the "body" column of the vm_pool
> table for a vm I can't query with onevm show:
>
> <VM>
> <ID>324</ID>
> <UID>0</UID>
> <GID>0</GID>
> <UNAME>oneadmin</UNAME>
> <GNAME>oneadmin</GNAME>
> <NAME>[...]</NAME>
> <PERMISSIONS>
> <OWNER_U>1</OWNER_U>
> <OWNER_M>1</OWNER_M>
> <OWNER_A>0</OWNER_A>
> <GROUP_U>0</GROUP_U>
> <GROUP_M>0</GROUP_M>
> <GROUP_A>0</GROUP_A>
> <OTHER_U>0</OTHER_U>
> <OTHER_M>0</OTHER_M>
> <OTHER_A>0</OTHER_A>
> </PERMISSIONS>
> <LAST_POLL>1375778872</LAST_POLL>
> <STATE>3</STATE>
> <LCM_STATE>3</LCM_STATE>
> <RESCHED>0</RESCHED>
> <STIME>1375457045</STIME>
> <ETIME>0</ETIME>
> <DEPLOY_ID>one-324</DEPLOY_ID>
> <MEMORY>4194304</MEMORY>
> <CPU>9</CPU>
> <NET_TX>432290511</NET_TX>
> <NET_RX>2072231827</NET_RX>
> <TEMPLATE>
> <CONTEXT>
> <ETH0_DNS><![CDATA[[...]]]></ETH0_DNS>
> <ETH0_GATEWAY><![CDATA[[...]]]></ETH0_GATEWAY>
> <ETH0_IP><![CDATA[[...]]]></ETH0_IP>
> <ETH0_MASK><![CDATA[[...]]]></ETH0_MASK>
> <FILES><![CDATA[[...]]]></FILES>
> <HOSTNAME><![CDATA[[...]]]></HOSTNAME>
> <TARGET><![CDATA[hdb]]></TARGET>
> </CONTEXT>
> <CPU><![CDATA[4]]></CPU>
> <DISK>
> <CLONE><![CDATA[YES]]></CLONE>
> <CLUSTER_ID><![CDATA[101]]></CLUSTER_ID>
> <DATASTORE><![CDATA[nonshared_ds]]></DATASTORE>
> <DATASTORE_ID><![CDATA[101]]></DATASTORE_ID>
> <DEV_PREFIX><![CDATA[hd]]></DEV_PREFIX>
> <DISK_ID><![CDATA[0]]></DISK_ID>
> <IMAGE><![CDATA[[...]]]></IMAGE>
> <IMAGE_ID><![CDATA[119]]></IMAGE_ID>
> <IMAGE_UNAME><![CDATA[oneadmin]]></IMAGE_UNAME>
> <READONLY><![CDATA[NO]]></READONLY>
> <SAVE><![CDATA[NO]]></SAVE>
>
> <SOURCE><![CDATA[/var/lib/one/datastores/101/3860dfcd1bec39ce672ba855564b44ca]]></SOURCE>
> <TARGET><![CDATA[hda]]></TARGET>
> <TM_MAD><![CDATA[ssh]]></TM_MAD>
> <TYPE><![CDATA[FILE]]></TYPE>
> </DISK>
> <DISK>
> <DEV_PREFIX><![CDATA[hd]]></DEV_PREFIX>
> <DISK_ID><![CDATA[1]]></DISK_ID>
> <FORMAT><![CDATA[ext3]]></FORMAT>
> <SIZE><![CDATA[26000]]></SIZE>
> <TARGET><![CDATA[hdc]]></TARGET>
> <TYPE><![CDATA[fs]]></TYPE>
> </DISK>
> <DISK>
> <DEV_PREFIX><![CDATA[hd]]></DEV_PREFIX>
> <DISK_ID><![CDATA[2]]></DISK_ID>
> <SIZE><![CDATA[8192]]></SIZE>
> <TARGET><![CDATA[hdd]]></TARGET>
> <TYPE><![CDATA[swap]]></TYPE>
> </DISK>
> <FEATURES>
> <ACPI><![CDATA[yes]]></ACPI>
> </FEATURES>
> <GRAPHICS>
> <KEYMAP><![CDATA[it]]></KEYMAP>
> <LISTEN><![CDATA[0.0.0.0]]></LISTEN>
> <PORT><![CDATA[6224]]></PORT>
> <TYPE><![CDATA[vnc]]></TYPE>
> </GRAPHICS>
> <MEMORY><![CDATA[4096]]></MEMORY>
> <NAME><![CDATA[[...]]]></NAME>
> <NIC>
> <BRIDGE><![CDATA[br1]]></BRIDGE>
> <CLUSTER_ID><![CDATA[101]]></CLUSTER_ID>
> <IP><![CDATA[[...]]]></IP>
> <MAC><![CDATA[02:00:c0:a8:1e:02]]></MAC>
> <MODEL><![CDATA[virtio]]></MODEL>
> <NETWORK><![CDATA[[...]]]></NETWORK>
> <NETWORK_ID><![CDATA[9]]></NETWORK_ID>
> <NETWORK_UNAME><![CDATA[oneadmin]]></NETWORK_UNAME>
> <VLAN><![CDATA[NO]]></VLAN>
> </NIC>
> <OS>
> <ARCH><![CDATA[x86_64]]></ARCH>
> <BOOT><![CDATA[hd]]></BOOT>
> </OS>
> <RAW>
> <TYPE><![CDATA[kvm]]></TYPE>
> </RAW>
> <REQUIREMENTS><![CDATA[CLUSTER_ID = 101]]></REQUIREMENTS>
> <TEMPLATE_ID><![CDATA[38]]></TEMPLATE_ID>
> <VCPU><![CDATA[4]]></VCPU>
> <VMID><![CDATA[324]]></VMID>
> </TEMPLATE>
> <HISTORY_RECORDS>
> <HISTORY>
> <OID>324</OID>
> <SEQ>0</SEQ>
> <HOSTNAME>[...]</HOSTNAME>
> <HID>15</HID>
> <STIME>1375457063</STIME>
> <ETIME>0</ETIME>
> <VMMMAD>vmm_kvm</VMMMAD>
> <VNMMAD>dummy</VNMMAD>
> <TMMAD>ssh</TMMAD>
> <DS_LOCATION>/var/datastore</DS_LOCATION>
> <DS_ID>102</DS_ID>
> <PSTIME>1375457063</PSTIME>
> <PETIME>1375457263</PETIME>
> <RSTIME>1375457263</RSTIME>
> <RETIME>0</RETIME>
> <ESTIME>0</ESTIME>
> <EETIME>0</EETIME>
> <REASON>0</REASON>
> </HISTORY>
> </HISTORY_RECORDS>
> </VM>
>
> I think it would be a great help to have the updated XSD files for
> all the body columns in the database: I'd be able to validate the xml
> structure of all the tables and highlight migration problems.
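>
> What I have in mind is something along these lines (a sketch, assuming a
> hypothetical vm.xsd for the vm_pool body and the nokogiri gem):
>
> require 'nokogiri'
> require 'sequel'
>
> xsd = Nokogiri::XML::Schema(File.read('vm.xsd'))  # hypothetical schema file
> db  = Sequel.sqlite('/var/lib/one/one.db')
>
> db.fetch('SELECT oid, body FROM vm_pool') do |row|
>   xsd.validate(Nokogiri::XML(row[:body])).each do |error|
>     puts "VM #{row[:oid]}: #{error.message}"
>   end
> end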
>
> Thanks! :)
>
> F.
>
>
> On 21/08/2013 12:13, Carlos Martín Sánchez wrote:
>
> Hi,
>
> Could you send us the xml of some of the failing vms and hosts? You can
> get it with the -x flag in onevm/host list.
>
> Send them off-list if you prefer.
>
> Regards
>
> --
> Join us at OpenNebulaConf2013 <http://opennebulaconf.com> in Berlin,
> 24-26 September, 2013
> --
> Carlos Martín, MSc
> Project Engineer
> OpenNebula - The Open-source Solution for Data Center Virtualization
> www.OpenNebula.org | cmartin at opennebula.org | @OpenNebula <http://twitter.com/opennebula>
>
>
> On Thu, Aug 8, 2013 at 11:29 AM, Federico Zani <
> federico.zani at roma2.infn.it> wrote:
>
>> Hi,
>> I am experiencing some issues after the upgrade from 3.7 to 4.2
>> (frontend on CentOS 6.4 and hosts running KVM); this is what I
>> did:
>>
>> - Stopped one and sunstone and backed up /etc/one
>> - yum localinstall opennebula-4.2.0-1.x86_64.rpm
>> opennebula-java-4.2.0-1.x86_64.rpm opennebula-ruby-4.2.0-1.x86_64.rpm
>> opennebula-server-4.2.0-1.x86_64.rpm opennebula-sunstone-4.2.0-1.x86_64.rpm
>> - duplicated the im and vmm mads for kvm as specified here:
>> http://opennebula.org/documentation:archives:rel4.0:upgrade#driver_names
>> - checked for other mismatches in one.conf, but found nothing to
>> fix
>> - onedb upgrade -v --sqlite /var/lib/one/one.db (no errors, just a few
>> warnings about manual fixes needed, which I applied)
>> - moved vm description files from /var/lib/one/[0-9]* to
>> /var/lib/one/vms/
>>
>> Then I tried to fsck the sqlite db, but got the following error:
>> --------------
>> onedb fsck -f -v -s /var/lib/one/one.db
>> Version read:
>> 4.2.0 : Database migrated from 3.7.80 to 4.2.0 (OpenNebula 4.2.0) by
>> onedb command.
>>
>> Sqlite database backup stored in /var/lib/one/one.db.bck
>> Use 'onedb restore' or copy the file back to restore the DB.
>> > Running fsck
>>
>> Datastore 0 is missing fom Cluster 101 datastore id list
>> Image 127 is missing fom Datastore 101 image id list
>> undefined method `elements' for nil:NilClass
>> Error running fsck version 4.2.0
>> The database will be restored
>> Sqlite database backup restored in /var/lib/one/one.db
>> -----------
>>
>> I also tried to reinstall ruby gems with /usr/share/one/install_gems but
>> still got the same issue.
>>
>> After some searching, I tried to start one and sunstone-server anyway,
>> and this is the result:
>> - I can do "onevm list" and "onehost list" correctly
>> - When I do a "onevm show" on a terminated vm it shows me the correct
>> information
>> - When I do a "onevm show" (on a running vm) or "onehost show", it
>> returns a "[VirtualMachineInfo] Error getting virtual machine [312]." or
>> either "[HostInfo] Error getting host [30]."
>>
>> In the log file (/var/log/oned.log) I can see the following errors when
>> issuing those commands:
>> ----------
>> Tue Aug 6 12:49:40 2013 [ONE][E]: SQL command was: SELECT body FROM
>> host_pool WHERE oid = 30, error: callback requested query abort
>> Tue Aug 6 12:49:40 2013 [ONE][E]: SQL command was: SELECT body FROM
>> vm_pool WHERE oid = 312, error: callback requested query abort
>> ------------
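>>
>> To inspect what oned fails to parse, the raw body can be dumped directly
>> from the database with something like (oid 312 taken from the log above;
>> sequel gem assumed):
>>
>> require 'sequel'
>>
>> db  = Sequel.sqlite('/var/lib/one/one.db')
>> row = db.fetch('SELECT body FROM vm_pool WHERE oid = 312').first
>> puts row[:body]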
>>
>> I am still able to see datastore information and the overall state
>> of my private cloud through the sunstone dashboard, but it seems I cannot
>> access information related to running vms and hosts: this leaves the
>> private cloud unusable (can't stop vms, can't run a new one, etc.)
>>
>> Any clues?
>>
>> Federico.
>>