[one-users] Unable to auto start one and sunstone service in Ubuntu 13.10

kiran ranjane kiran.ranjane at gmail.com
Fri Jan 31 06:57:31 PST 2014


Hi Stefan,

Good question, but I have never encountered such an issue in 3 months. I
use a dedicated private network (172.x.x.x) for MySQL and One, so my
storage, KVM nodes and everything else communicate over the 172.x.x.x
network. I have a separate LAN in the 10.x.x.x range which is used only to
access Sunstone and as the VM LAN network.

I know MySQL replication is the most flexible way to deal with scalability.
If not done right, however, replication can result in disaster.

The most common problem with replication is *primary key collision*.
Primary key collision involves two MySQL servers creating table rows
containing different data, but the same primary key. When this happens
replication stops. With replication stopped, the difference between the
data on the servers grows. At some point the weirdness gets noticed. Then
begins the painful process of recovery, of trying to weave masses of
conflicting data into a whole.

You can overcome this issue by adding the following to the MySQL config
files when setting up M/M replication:

MySQL Server A config file:

    auto-increment-increment = 2
    auto-increment-offset    = 1

MySQL Server B config file:

    auto-increment-increment = 2
    auto-increment-offset    = 2
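To see why these offsets work: with an increment of 2, Server A only ever
generates odd ids (1, 3, 5, ...) and Server B even ones (2, 4, 6, ...), so
the two sequences can never produce the same primary key. A quick shell
illustration:

```shell
# Server A: auto-increment-offset = 1 -> ids 1, 3, 5, 7, 9, ...
echo "server A ids: $(seq 1 2 9 | tr '\n' ' ')"
# Server B: auto-increment-offset = 2 -> ids 2, 4, 6, 8, 10, ...
echo "server B ids: $(seq 2 2 10 | tr '\n' ' ')"
```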

This situation arises when you write to both servers of an M/M pair
simultaneously. In my setup, however, although it is M/M replication and
both MySQL servers are in Active/Active mode, writes only go to Server A
unless the fail-over is triggered, at which point services like one and
sunstone start on Server B and the virtual IP comes up on Server B.

So even if communication between the MySQL servers is interrupted, the
fail-over will not be triggered, because it is configured on the 10.x.x.x
IP and not the 172.x.x.x one. Server A will still be the master, and once
communication is restored it will sync the data to Server B without any
collision.

The fail-over process is automatic: ucarp on Server B sends keepalive
requests to Server A, and when it finds eth0 and the virtual IP down on
Server A, the script (vip-up.sh) is executed on Server B, which assigns
the virtual IP to it and then starts the one and sunstone services.
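A minimal sketch of what such a vip-up.sh can look like. The interface,
the 10.0.0.100 address and the service names below are placeholders, not
the exact values from the setup; RUN defaults to "echo" so the script
dry-runs (prints the commands) unless you clear it on a real server.

```shell
#!/bin/sh
# Hypothetical vip-up.sh sketch: bring up the VIP, then start services.
# Set RUN= (empty) on a real server; the default only prints commands.
RUN="${RUN:-echo}"
IFACE="${1:-eth0}"      # ucarp is assumed to pass the interface...
VIP="${2:-10.0.0.100}"  # ...and the virtual IP as arguments

$RUN ip addr add "$VIP/24" dev "$IFACE"   # take over the virtual IP
$RUN service opennebula start             # then start one
$RUN service opennebula-sunstone start    # and sunstone
```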

The fail-back happens the same way but in reverse order: the vip-down.sh
script stops the services, removes the virtual IP from Server B, and
assigns it back to Server A.
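A matching vip-down.sh sketch, again with placeholder names and the same
dry-run default, doing the steps in reverse:

```shell
#!/bin/sh
# Hypothetical vip-down.sh sketch: stop services first, then release the
# VIP. RUN=echo makes this a dry run; clear it on a real server.
RUN="${RUN:-echo}"
IFACE="${1:-eth0}"
VIP="${2:-10.0.0.100}"

$RUN service opennebula-sunstone stop     # stop sunstone first
$RUN service opennebula stop              # then one
$RUN ip addr del "$VIP/24" dev "$IFACE"   # release the virtual IP
```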

If you have any questions, feel free to get back to me :)

Thanks and Regards

Kiran Ranjane




On Fri, Jan 31, 2014 at 12:59 PM, Stefan Kooman <stefan at bit.nl> wrote:

> Quoting kiran ranjane (kiran.ranjane at gmail.com):
>
> > I have tested this and it works well. I only get 3 to 4 timed-out
> > requests when fail-over is triggered, so it is quite instant and simple
> > to troubleshoot in case of issues. There is no split brain, as M/M
> > replication is used and both databases are always in sync. Almost all
> > syncing goes one way, from Server A to B, because Server A is the master
> > node and services run on one server at a time. Services on Server B
> > start only after fail-over is triggered, so the syncing of data from B
> > to A is always less, because it is used as a standby cloud storage server.
>
> You state "No split brain as M/M replication is used ...". But what if
> communication between server A and B is lost? Server A becomes master
> and starts one. Server B becomes master and starts one. They both update
> the one database, but replication is not (yet) happening because no
> communication is possible so no issue yet. After communication is
> restored replication continues, and this might work without errors
> (although I have my doubts about that). One master node (B) will probably
> shut itself down. Master A continues to run. I wonder how healthy the
> one database is after such an incident. Do you do manual fail-overs or
> automatic ones?
>
> Gr. Stefan
>
>
> --
> | BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
> | GPG: 0xD14839C6                   +31 318 648 688 / info at bit.nl
>