Monday, November 23, 2015

DRBD 8.3 Active/passive mode for NFS and MySQL services

DRBD Definition:
DRBD (Distributed Replicated Block Device) is a software-based, shared-nothing, replicated storage solution that mirrors the content of block devices (for example hard disks, partitions and logical volumes) between hosts.

Why is DRBD Important?
  1. Real Time: Replication occurs continuously while applications modify the data on the device.
  2. Transparent: Applications need not be aware that the data is stored on multiple hosts.
  3. Synchronous/Asynchronous: In synchronous mirroring, a writing application is notified of write completion only after the write has been carried out on both hosts. In asynchronous mirroring, the writing application is notified of write completion as soon as the write has completed locally, which is usually before it has reached the peer.

Pros and Cons of DRBD:
Pros: DRBD is a driver for a virtual block device, so it sits near the bottom of the system's I/O stack. Because of this, DRBD is flexible and versatile, which makes it suitable for adding high availability to almost any application.

Cons: Because it sits below the file system, DRBD cannot detect file system corruption, and it cannot add active-active clustering capability to file systems such as ext3 or XFS.

User Administration Tools:
These tools communicate with the kernel module in order to configure and administer DRBD resources. There are three of them:
  1. drbdadm: The high-level administration tool of the DRBD program suite. It obtains all DRBD configuration parameters from the configuration file /etc/drbd.conf and acts as a front end for drbdsetup and drbdmeta. drbdadm has a dry-run mode, invoked with the -d option, that shows which drbdsetup and drbdmeta calls it would issue without actually running them.
  2. drbdsetup: The low-level tool that configures the DRBD module loaded into the running kernel. All parameters to drbdsetup must be passed on the command line. The separation between drbdadm and drbdsetup allows for maximum flexibility. Most users will rarely need to use drbdsetup directly, if at all.
  3. drbdmeta: Allows users to create, dump, restore and modify DRBD metadata structures. Like drbdsetup, most users will only rarely need to use drbdmeta directly.

Replication modes
DRBD supports three distinct replication modes, allowing three degrees of replication synchronicity.
Protocol A: Asynchronous replication protocol: Local write operations on the primary node are considered completed as soon as the local disk write has finished, and the replication packet has been placed in the local TCP send buffer. In the event of forced fail-over, data loss may occur. The data on the standby node is consistent after fail-over; however, the most recent updates performed prior to the crash could be lost. Protocol A is most often used in long distance replication scenarios. When used in combination with DRBD Proxy it makes an effective disaster recovery solution.
Protocol B: Memory synchronous (semi-synchronous) replication protocol: Local write operations on the primary node are considered completed as soon as the local disk write has occurred, and the replication packet has reached the peer node. Normally, no writes are lost in case of forced fail-over. However, in the event of simultaneous power failure on both nodes and concurrent, irreversible destruction of the primary’s data store, the most recent writes completed on the primary may be lost.
Protocol C: Synchronous replication protocol: Local write operations on the primary node are considered completed only after both the local and the remote disk write have been confirmed. As a result, loss of a single node is guaranteed not to lead to any data loss. Data loss is, of course, inevitable even with this replication protocol if both nodes (and their storage subsystems) are irreversibly destroyed at the same time.

Preparing the servers

My local server setup on CentOS 6:
Server1 = (
Server2 = (
Virtual IP for cluster =

Once the OS is installed, we disable SELinux and iptables and bring CentOS/Red Hat up to date with yum update. Both can be re-enabled later; they are disabled for now to make troubleshooting easier if it is needed.
# yum update -y
# iptables -F
# chkconfig iptables off
# lokkit --selinux=disabled
# vim /etc/sysconfig/selinux (set SELINUX=disabled here)
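
The edit to /etc/sysconfig/selinux can also be scripted instead of done in vim. Below is a minimal sketch that runs against a scratch copy, so it is safe to try; the scratch file and its sample contents are my assumptions. On a real server you would point sed at /etc/sysconfig/selinux itself, which on CentOS 6 is normally a symlink, hence the --follow-symlinks option of GNU sed.

```shell
# Scratch copy standing in for /etc/sysconfig/selinux (assumption, not the real file).
selinux_cfg="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$selinux_cfg"

# Switch the SELINUX= line to disabled; SELINUXTYPE= does not match the pattern.
sed -i --follow-symlinks 's/^SELINUX=.*/SELINUX=disabled/' "$selinux_cfg"

# Show the result.
grep '^SELINUX=' "$selinux_cfg"
```

The change only applies after the next boot, which fits the reboot that comes later in this setup.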

This assumes that the network and hostname were set up during the OS installation. If you haven’t done that yet, now is probably a good time.

# vim /etc/sysconfig/network-scripts/ifcfg-eth0
# ifconfig


We need to make sure that ntpd is always running so that the clocks on both servers stay synchronized.
# date
# ntpdate ntpserver/IP



Don’t forget to add the host entries to the /etc/hosts file on both systems:
# vim /etc/hosts

Copy the /etc/hosts file to db2.
# scp  /etc/hosts
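
If you prefer to script the host entries, here is a small sketch that appends them only when they are missing. It works on a scratch file rather than the real /etc/hosts, and the addresses are hypothetical (taken from the 192.0.2.0/24 documentation range); substitute your own.

```shell
# Scratch file standing in for /etc/hosts (assumption).
hosts_file="$(mktemp)"
printf '127.0.0.1   localhost\n' > "$hosts_file"

# add_host IP NAME: append "IP NAME" unless NAME is already listed.
add_host() {
    if ! grep -qw -- "$2" "$hosts_file"; then
        printf '%s   %s\n' "$1" "$2" >> "$hosts_file"
    fi
}

# Hypothetical addresses; db1 and db2 are the node names used in this post.
add_host 192.0.2.11 db1
add_host 192.0.2.12 db2
add_host 192.0.2.11 db1   # running it again changes nothing

cat "$hosts_file"
```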

On both servers:

We need to install gcc, make and other dependencies in case we have to compile any code that is not available as an rpm package.

# yum install gcc glibc make flex OpenIPMI net-snmp python

To make some of these changes take effect, we now reboot the servers. This is only necessary because we disabled SELinux and updated the kernel; all the other services could have been restarted without a server reboot.

# reboot

Installation of DRBD Package:
Note: As you probably know, RHEL 6 / CentOS 6 does not ship DRBD in any of its yum repositories. You need a support contract with Red Hat to get DRBD (they partnered with LINBIT after it was decided not to support DRBD in EL6, because DRBD didn't get into the mainline kernel until 2.6.33 and EL6 has 2.6.32). Alternatively, you can install it from source or from rpm packages, or use the ELRepo yum repository.
You can download the source with the command below:

# wget

Install the ELRepo package and configure it for the DRBD installation.

# vi  /etc/yum.repos.d/elrepo.repo

# yum --enablerepo=elrepo install drbd83-utils kmod-drbd83

Note: To install the packages from ELRepo, the system should be registered with RHN so that the dependencies can be resolved.

Let’s now set up our DRBD device. We will create an LVM logical volume to be used by DRBD (the commands below use 10G; size it to suit your data).
Create the logical volume on both servers (in my case I am using KVM, so the disk is /dev/vda):
[root@db1~]# pvcreate /dev/vda5
[root@db2~]# pvcreate /dev/vda5

[root@db1~]# vgcreate vg01  /dev/vda5
[root@db2~]# vgcreate vg01  /dev/vda5

[root@db1~]# lvcreate --name  lv01 --size 10G  vg01
[root@db2~]# lvcreate --name  lv01 --size 10G  vg01


Now save the original global configuration file on both servers:
# mv /etc/drbd.d/global_common.conf /etc/drbd.d/global_common.sample

And create a new file:
# vim /etc/drbd.d/global_common.conf
global { usage-count no; }
common {
    syncer { rate 10M; }
}

Create a new resource file, which I called “main”. It must be identical on both servers. Each "on" section is named after one node, using the hostname that uname -n reports on that node:
# vim /etc/drbd.d/main.res
resource main {
    protocol C;
    startup { wfc-timeout 0; degr-wfc-timeout 120; }
    disk { on-io-error detach; }

    on {
        device /dev/drbd0;
        disk /dev/vg01/lv01;
        meta-disk internal;
    }
    on {
        device /dev/drbd0;
        disk /dev/vg01/lv01;
        meta-disk internal;
    }
}
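
The listing above does not show the host names or the address lines of the two "on" sections. As a rough sketch of what a complete file could look like, the helper below prints one from four arguments. The function name is made up for this example, db1/db2 are the node names from the shell prompts in this post, and the addresses and port 7788 are hypothetical placeholders, not the values from my setup.

```shell
# print_main_res HOST1 IP1 HOST2 IP2: emit a complete DRBD 8.3 resource definition.
# Hypothetical helper; host names must match `uname -n` on each node.
print_main_res() {
    cat <<EOF
resource main {
    protocol C;
    startup { wfc-timeout 0; degr-wfc-timeout 120; }
    disk { on-io-error detach; }

    on $1 {
        device /dev/drbd0;
        disk /dev/vg01/lv01;
        address $2:7788;
        meta-disk internal;
    }
    on $3 {
        device /dev/drbd0;
        disk /dev/vg01/lv01;
        address $4:7788;
        meta-disk internal;
    }
}
EOF
}

# Placeholder addresses from the 192.0.2.0/24 documentation range.
print_main_res db1 192.0.2.11 db2 192.0.2.12
```

Redirecting the output to /etc/drbd.d/main.res on both nodes and then running drbdadm dump main is one way to confirm that DRBD can parse the result.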

Now we have to create the metadata on both servers:
[root@db1~]# drbdadm create-md main
[root@db2~]# drbdadm create-md main

And we can now start DRBD on both servers at the same time:
[root@db1 drbd.d]# /etc/init.d/drbd start
[root@db2 drbd.d]# /etc/init.d/drbd start

You have two ways to verify that DRBD is running properly:
# service drbd status
# cat /proc/drbd
Version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by dag@Build64R6, 2011-08-08 08:54:05
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:41941724

As you can see, the nodes are connected, but “ro:Secondary/Secondary” means that we haven’t yet told the system which one is the Primary server (master) that holds the data to be replicated. Once we tell the system which node is the master, it will start the synchronization.

We now tell DRBD which server is the Primary. Run one of the commands below on the server that you want to make primary (the drbdsetup form is the low-level equivalent of the drbdadm form):

# drbdsetup /dev/drbd0 primary -o
# drbdadm -- --overwrite-data-of-peer primary main

Note: ‘main’ is your resource name

# cat /proc/drbd
Version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by dag@Build64R6, 2011-08-08 08:54:05
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:503808 nr:0 dw:0 dr:504472 al:0 bm:30 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:41437916
[>....................] sync'ed: 1.3% (40464/40956)M
Finish: 1:06:40 speed: 10,340 (10,280) K/sec

This output carries a lot of information about the connection and the device:
  1. cs:       Connection state.
  2. ro:       Roles of the nodes. The local node is shown first, followed by the peer.
  3. ds:       Disk states. The local disk is shown first, followed by the peer's.
  4. ns:       Network send: volume of net data sent to the partner via the network connection.
  5. nr:       Network receive: volume of net data received from the partner via the network connection.
  6. dw:       Disk write: net data written to the local hard disk.
  7. dr:       Disk read: net data read from the local hard disk.
  8. al:       Activity log: number of updates of the activity log area of the metadata.
  9. lo:       Number of open requests to the local I/O subsystem issued by DRBD.
  10. pe:      Pending: number of requests sent to the partner but not yet answered.
  11. ua:      Unacknowledged: number of requests received over the network connection but not yet answered.
  12. ap:      Application pending: number of I/O requests forwarded to DRBD but not yet answered.
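
To pull a single field out of this output in a script, something like the sketch below can be used. The function name drbd_field is my own invention, and the sample text is copied from the /proc/drbd output shown above; on a live node you would feed it /proc/drbd instead.

```shell
# drbd_field KEY: read /proc/drbd-style text on stdin and print the value after "KEY:".
drbd_field() {
    tr ' ' '\n' | sed -n "s/^$1://p"
}

# Sample status lines copied from the output above.
status='0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:503808 nr:0 dw:0 dr:504472 al:0 bm:30 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:41437916'

printf '%s\n' "$status" | drbd_field cs    # connection state
printf '%s\n' "$status" | drbd_field ro    # local/peer roles
printf '%s\n' "$status" | drbd_field oos   # amount of data still out of sync
```

With more than one DRBD device configured, the function prints one value per device, so it would need a filter on the device number first.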

The synchronization has started and will take a little while to complete. Wait until it is done, that is until you see the output below, before moving to the next step.

# cat /proc/drbd
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:41941724 nr:0 dw:0 dr:41942388 al:0 bm:2560 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
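
Rather than re-reading /proc/drbd by eye, the wait can be scripted. The sketch below shows two hypothetical helpers: one that estimates the remaining time from the oos and speed figures, and one that checks whether both disks report UpToDate. The numbers are the ones from the mid-sync output earlier in this post; the estimate is plain integer division, so expect it to differ slightly from what DRBD itself prints.

```shell
# drbd_in_sync: succeed if the status text on stdin reports both disks UpToDate.
drbd_in_sync() {
    grep -q 'ds:UpToDate/UpToDate'
}

# sync_eta OOS_KB RATE_KB_PER_SEC: print a rough h:mm:ss estimate.
sync_eta() {
    secs=$(( $1 / $2 ))
    printf '%d:%02d:%02d\n' $(( secs / 3600 )) $(( secs % 3600 / 60 )) $(( secs % 60 ))
}

# oos and current speed from the mid-sync output earlier in this post.
sync_eta 41437916 10340

# Status line from the finished sync above.
if printf '%s\n' '0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----' | drbd_in_sync; then
    echo "in sync"
fi
```

On a real node, a loop such as `until drbd_in_sync < /proc/drbd; do sleep 30; done` would block until the initial sync has finished.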

Now that our servers are in sync, we can format /dev/drbd0 with our preferred file system. In my case I will use ext4. Run the command below on the primary server:
# mkfs.ext4 /dev/drbd0

Configuring NFS exports for Heartbeat integration:
Install NFS on server1 and server2, or check that it is already installed.

Let’s prepare our NFS first:
On both servers:
# mkdir /drbd
# vi /etc/exports
/drbd/main *(rw)

On server1 only :
# mount /dev/drbd0 /drbd
# mkdir /drbd/main

NFS stores some information about your NFS mounts in /var/lib/nfs, and since that information has to be mirrored, we move it to the DRBD device:

# mv /var/lib/nfs/ /drbd/

This might print some errors, but do not worry about them because the directories will be created anyway. Then move whatever is left of /var/lib/nfs out of the way:

# mv /var/lib/nfs /var/lib/nfsBackup

Then symlink /var/lib/nfs to our /drbd directory:
# ln -s /drbd/nfs/ /var/lib/nfs
# umount /drbd

On server2 only:
# mv /var/lib/nfs/ /var/lib/nfsBackup
# ln -s /drbd/nfs/ /var/lib/nfs

The symbolic link will be broken on server2 because /dev/drbd0 is not mounted there. It will start to resolve once server2 mounts the device during an NFS fail-over.
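
If you want to convince yourself how the link behaves before touching a real server, the relocation can be rehearsed in a throwaway directory. This is only a sketch: $root stands in for /, the etab file is a stand-in for the NFS state files, and moving the target aside is my way of imitating an unmounted /drbd.

```shell
# Scratch tree standing in for / (assumption).
root="$(mktemp -d)"
mkdir -p "$root/var/lib/nfs" "$root/drbd"
touch "$root/var/lib/nfs/etab"

# Move the state directory onto the pretend DRBD mount, then link back to it.
mv "$root/var/lib/nfs" "$root/drbd/"
ln -s "$root/drbd/nfs" "$root/var/lib/nfs"

# While the target exists ("mounted"), the old path still works.
mounted_view=""
if [ -e "$root/var/lib/nfs/etab" ]; then
    mounted_view="resolves"
fi
echo "with /drbd mounted: $mounted_view"

# Imitate the unmounted state by moving the target out of the way.
mv "$root/drbd/nfs" "$root/drbd-unmounted-nfs"
[ -e "$root/var/lib/nfs" ] || echo "without /drbd mounted: link is dangling"
```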

Heartbeat installation and configuration:
Last but not least, configure Heartbeat to control a virtual IP address and fail over NFS in case of a node failure.
We will install heartbeat from the EPEL repository:

# rpm -Uvh
# vi /etc/yum.repos.d/epel.repo
# yum --enablerepo=epel install heartbeat*

Check on both servers:
# rpm -qa heartbeat*

Create the following file on both servers with the exact same content:

# vi /etc/ha.d/
keepalive 2
deadtime 30
bcast eth0

# Same thing here for the node names: use the hostnames of your hosts, which must be whatever uname -n returns.

The next step is to create the resource file for Heartbeat on both servers, again with exactly the same content:

# vi /etc/ha.d/haresources
IPaddr:: drbddisk::main Filesystem::/dev/drbd0::/drbd::ext4 nfslock  nfs

Note: This is all on one line.

# The first word is the hostname of the primary server; the IP is the one I chose as the virtual IP to be moved to the slave in case of a failure.
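
To make the layout of that single line easier to see, here is a sketch that assembles it from its two variable parts. The function name is hypothetical, db1 is the node name from the shell prompts in this post, and the virtual IP is a placeholder from the documentation address range.

```shell
# haresources_line NODE VIP: print the one-line haresources entry for the NFS stack.
haresources_line() {
    printf '%s IPaddr::%s drbddisk::main Filesystem::/dev/drbd0::/drbd::ext4 nfslock nfs\n' "$1" "$2"
}

# Hypothetical primary node name and virtual IP.
haresources_line db1 192.0.2.100
```

Heartbeat starts the resources on the line from left to right and stops them from right to left, which is why the IP and the DRBD device come before the file system and the NFS services.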

The last thing is to create the authentication file on both servers again with the same content:

# vi /etc/ha.d/authkeys
auth 3
3 md5 redhat

This password file should only be readable by the root user:
# chmod 600 /etc/ha.d/authkeys
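
The word "redhat" above is only an example secret. A sketch of generating a random one and writing the file with tight permissions from the start follows; it writes into a scratch directory standing in for /etc/ha.d (an assumption), and keeps the same "auth 3 / 3 md5" layout used above.

```shell
# Scratch directory standing in for /etc/ha.d (assumption).
ha_dir="$(mktemp -d)"

# 16 random bytes rendered as 32 hex characters.
key="$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')"

# Create the file inside a subshell with umask 077 so it is never world-readable.
( umask 077; printf 'auth 3\n3 md5 %s\n' "$key" > "$ha_dir/authkeys" )

ls -l "$ha_dir/authkeys"
```

Generate the key once and copy the resulting file to the other node, since both sides must hold the same secret.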

Copy all the files from db1 to the db2 server.
# cd /etc/ha.d/
# scp    db2:/etc/ha.d/
# scp  authkeys  db2:/etc/ha.d/
# scp  haresources  db2:/etc/ha.d/

Testing the Cluster:
On both servers start heartbeat service:
# service heartbeat start

On server1:
[root@db1 ~]# ifconfig
[root@db1 ~]# df  -h

Let’s test the DRBD/NFS fail-over now.
You can shut down server1, or simply stop the heartbeat service:

On server1:
# service heartbeat stop

On server2:
[root@db2 ~]# ifconfig
[root@db2 ~]# df -h

Compare this with the output on server1 before the fail-over: the virtual IP and the /drbd mount should now show up on server2.

Configure MySQL for heartbeat integration:

Before making changes on the servers, stop Heartbeat on both nodes to be safe.
Install mysql and mysql-server on both servers:
# yum   install mysql mysql-server -y

Now create the directory /drbd/mysql on server1:
# mkdir /drbd/mysql

Change the ownership of the mysql directory:
# chown mysql:mysql  /drbd/mysql

Edit the MySQL configuration file /etc/my.cnf on both servers and change the data directory path (datadir) so that it points to /drbd/mysql.
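
As a sketch of that change, the snippet below rewrites the datadir line with sed. It operates on a scratch copy whose contents I have assumed to resemble a stock CentOS 6 my.cnf; check your own file before running the same sed against /etc/my.cnf.

```shell
# Scratch copy standing in for /etc/my.cnf (assumed contents).
my_cnf="$(mktemp)"
cat > "$my_cnf" <<'EOF'
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
EOF

# Point datadir at the directory created on the DRBD mount.
sed -i 's|^datadir=.*|datadir=/drbd/mysql|' "$my_cnf"

grep '^datadir=' "$my_cnf"
```

Keep in mind that a new, empty data directory has to be initialised (for example with mysql_install_db) or filled with a copy of the existing data before mysqld will start from it.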

Copy the /etc/my.cnf file from db1 to db2:
# scp /etc/my.cnf  db2:/etc/

Heartbeat setup for MySQL:
Note: Leave /etc/ha.d/  and /etc/ha.d/authkeys as they are, as configured earlier on both servers.

Now add the entry for the MySQL service to the /etc/ha.d/haresources file on both servers.
#vim /etc/ha.d/haresources

Copy the file from db1 to the db2 server:
# scp /etc/ha.d/haresources db2:/etc/ha.d/

Testing of cluster:
Start heartbeat service on both the servers.

#service heartbeat start
#chkconfig heartbeat on

On server1:
#/etc/init.d/mysqld status

Check Failover:
Shut down server1 or stop the heartbeat service on server1:
#/etc/init.d/heartbeat stop

On server2:
Check the MySQL status; it should be running:
#/etc/init.d/mysqld status



