My goal in using DRBD in Primary/Primary mode with GFS is to load-balance an HTTP service. My servers look like the following:
I use the GFS partition as the document root for my web server (Apache). A SAN might be better storage, but it is expensive. Other options are iSCSI or GNBD, but they need more servers, which cost extra money. I may move to SAN, iSCSI or GNBD in the future, but for now DRBD and GFS on two nodes behind a load balancer works well and is fast enough.
For testing and preparing this quick howto I used Xen to create two virtual machines, with CentOS 5 as the OS. The partition I want to use for GFS is named xvdb1. Make sure your partition does not contain any data you want to keep (it will be destroyed). To wipe the partition I ran this command on both nodes:

dd if=/dev/zero of=/dev/xvdb1

Change /dev/xvdb1 to your own partition (again, make sure it does not contain any needed data).
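Because dd overwrites the device without asking, a small guard can help avoid wiping a partition that is in use. The following is only a sketch: the helper name device_is_mounted and its optional second argument (a mounts file, defaulting to /proc/mounts) are hypothetical names chosen for illustration, and the check only catches devices listed under that exact name in the mounts table. A partition used indirectly, for example through LVM, would not be detected.

```shell
# Hypothetical helper: succeed (exit 0) if the given device is listed
# as mounted. The second argument is an assumption added so the check
# can be tried against a sample file instead of the live /proc/mounts.
device_is_mounted() {
    dev="$1"
    mounts="${2:-/proc/mounts}"
    # Field 1 of each mounts line is the device name.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}
```

It could be used as: device_is_mounted /dev/xvdb1 || dd if=/dev/zero of=/dev/xvdb1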
The following commands have to be run on both nodes; for simplicity I show the output of one machine.

* download DRBD on node1 & node2:
[root@node1 ~]# mkdir downloads
[root@node1 ~]# cd downloads/
[root@node1 downloads]# wget -c http://oss.linbit.com/drbd/8.2/drbd-8.2.5.tar.gz

* untar it:
[root@node1 downloads]# time tar -xzpf drbd-8.2.5.tar.gz -C /usr/src/

real 0m0.162s
user 0m0.016s
sys 0m0.028s
[root@node1 downloads]# ls /usr/src/
drbd-8.2.5 redhat
* before building DRBD:
Before you start, make sure you have the following installed on your system:
- make, gcc, the glibc development libraries, and the flex scanner generator
- kernel-headers and kernel-devel:
[root@node1 downloads]# yum list kernel-*
Loading "installonlyn" plugin
Setting up repositories
Reading repository metadata in from local files
Installed Packages
kernel.i686              2.6.18-8.el5   installed
kernel-headers.i386      2.6.18-8.el5   installed
kernel-xen.i686          2.6.18-8.el5   installed
kernel-xen-devel.i686    2.6.18-8.el5   installed
Available Packages
kernel-PAE.i686          2.6.18-8.el5   local
kernel-PAE-devel.i686    2.6.18-8.el5   local
kernel-devel.i686        2.6.18-8.el5   local
kernel-doc.noarch        2.6.18-8.el5   local

Remember that I use the Xen kernel, which is why kernel-xen-devel is the devel package installed here.
* building DRBD:
- building the DRBD kernel module:
[root@node1 downloads]# cd /usr/src/drbd-8.2.5/drbd
[root@node1 drbd]# make clean all
...
mv .drbd_kernelrelease.new .drbd_kernelrelease
Memorizing module configuration ... done.
[root@node1 drbd]#

- checking the new kernel module:
[root@node1 drbd]# modinfo drbd.ko
filename: drbd.ko
alias: block-major-147-*
license: GPL
description: drbd - Distributed Replicated Block Device v8.2.5
author: Philipp Reisner <phil@linbit.com>, Lars Ellenberg <lars@linbit.com>
srcversion: E325FBFE020C804C4FABA31
depends:
vermagic: 2.6.18-8.el5xen SMP mod_unload 686 REGPARM 4KSTACKS gcc-4.1
parm: minor_count:Maximum number of drbd devices (1-255) (int)
parm: allow_oos:DONT USE! (bool)
parm: enable_faults:int
parm: fault_rate:int
parm: fault_count:int
parm: fault_devs:int
parm: trace_level:int
parm: trace_type:int
parm: trace_devs:int
parm: usermode_helper:string
[root@node1 drbd]#
- building a DRBD RPM package:
[root@node1 drbd]# cd /usr/src/drbd-8.2.5/
[root@node1 drbd-8.2.5]# make rpm
...
You have now:
-rw-r--r-- 1 root root 142722 May 23 11:45 dist/RPMS/i386/drbd-8.2.5-3.i386.rpm
-rw-r--r-- 1 root root 232238 May 23 11:45 dist/RPMS/i386/drbd-debuginfo-8.2.5-3.i386.rpm
-rw-r--r-- 1 root root 851602 May 23 11:45 dist/RPMS/i386/drbd-km-2.6.18_8.el5xen-8.2.5-3.i386.rpm
[root@node1 drbd-8.2.5]#

- installing DRBD:
[root@node1 drbd-8.2.5]# cd dist/RPMS/i386/
[root@node1 i386]# rpm -ihv drbd-8.2.5-3.i386.rpm drbd-km-2.6.18_8.el5xen-8.2.5-3.i386.rpm
Preparing...                ########################################### [100%]
   1:drbd                   ########################################### [ 50%]
   2:drbd-km-2.6.18_8.el5xen########################################### [100%]
* Configuring DRBD:
- For lower-level storage I use a simple setup: both hosts have a free (currently unused) partition named /dev/xvdb1, and I use internal meta data.
- For /etc/drbd.conf I use this configuration:

resource r0 {
  protocol C;
  startup {
    become-primary-on both;
  }
  net {
    allow-two-primaries;
    cram-hmac-alg "sha1";
    shared-secret "123456";
    after-sb-0pri discard-least-changes;
    after-sb-1pri violently-as0p;
    after-sb-2pri violently-as0p;
    rr-conflict violently;
  }
  syncer {
    rate 44M;
  }
  on node1.test.lab {
    device /dev/drbd0;
    disk /dev/xvdb1;
    address 192.168.1.1:7789;
    meta-disk internal;
  }
  on node2.test.lab {
    device /dev/drbd0;
    disk /dev/xvdb1;
    address 192.168.1.2:7789;
    meta-disk internal;
  }
}
note that “become-primary-on both” startup option is needed in Primary/Primary configuration.
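DRBD selects the "on <hostname>" section whose name matches the node's own hostname as reported by uname -n, so a mismatch there is a common reason for drbdadm not accepting the resource. A minimal sketch of a check follows, assuming each "on <hostname> {" starts on its own line in the config file; the function name host_in_drbd_conf and its arguments are hypothetical.

```shell
# Hypothetical helper: succeed if the config file has an "on <host>"
# section for the given host (default: this machine's `uname -n`).
# Assumes "on <host> {" begins a line, as in a conventionally
# indented drbd.conf.
host_in_drbd_conf() {
    conf="${1:-/etc/drbd.conf}"
    host="${2:-$(uname -n)}"
    awk -v h="$host" '$1 == "on" && $2 == h { found = 1 } END { exit !found }' "$conf"
}
```

For example: host_in_drbd_conf || echo "no 'on' section for $(uname -n) in /etc/drbd.conf"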
* starting DRBD for the first time:
The following steps must be performed on both nodes:
- Create device metadata:
[root@node1 i386]# drbdadm create-md r0
v08 Magic number not found
v07 Magic number not found
v07 Magic number not found
v08 Magic number not found
Writing meta data...
initialising activity log
NOT initialized bitmap
New drbd meta data block sucessfully created.

--== Creating metadata ==--
As with nodes we count the total number of devices mirrored by DRBD at
at http://usage.drbd.org.

The counter works completely anonymous. A random number gets created for
this device, and that randomer number and the devices size will be sent.

http://usage.drbd.org/cgi-bin/insert_usage.pl?nu=18231616900827588600&ru=15113975333795790860&rs=2147483648

Enter 'no' to opt out, or just press [return] to continue:
success
- Attach. This step associates the DRBD resource with its backing device:
[root@node1 i386]# modprobe drbd
[root@node1 i386]# drbdadm attach r0

- Verify that DRBD is running:
on node1:
[root@node1 i386]# cat /proc/drbd
version: 8.2.5 (api:88/proto:86-88)
GIT-hash: 9faf052fdae5ef0c61b4d03890e2d2eab550610c build by root@node1.test.lab, 2008-05-23 11:45:23
 0: cs:StandAlone st:Secondary/Unknown ds:Inconsistent/Outdated r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

on node2:
[root@node2 i386]# cat /proc/drbd
version: 8.2.5 (api:88/proto:86-88)
GIT-hash: 9faf052fdae5ef0c61b4d03890e2d2eab550610c build by root@node2.test.lab, 2008-05-23 12:58:18
 0: cs:StandAlone st:Secondary/Unknown ds:Inconsistent/Outdated r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
- Connect. This step connects the DRBD resource with its counterpart on the peer node:
[root@node1 i386]# drbdadm connect r0
[root@node1 i386]# cat /proc/drbd
version: 8.2.5 (api:88/proto:86-88)
GIT-hash: 9faf052fdae5ef0c61b4d03890e2d2eab550610c build by root@node1.test.lab, 2008-05-23 11:45:23
 0: cs:WFConnection st:Secondary/Unknown ds:Inconsistent/Outdated C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
- Initial device synchronization:
The following step must be done on one node only; I used node1:
[root@node1 i386]# drbdadm -- --overwrite-data-of-peer primary r0
- verify:

[root@node1 i386]# cat /proc/drbd
version: 8.2.5 (api:88/proto:86-88)
GIT-hash: 9faf052fdae5ef0c61b4d03890e2d2eab550610c build by root@node1.test.lab, 2008-05-23 11:45:23
 0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
    ns:792 nr:0 dw:0 dr:792 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        [>....................] sync'ed: 0.2% (2096260/2097052)K
        finish: 2:11:00 speed: 264 (264) K/sec
        resync: used:0/31 hits:395 misses:1 starving:0 dirty:0 changed:1
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

[root@node2 i386]# cat /proc/drbd
version: 8.2.5 (api:88/proto:86-88)
GIT-hash: 9faf052fdae5ef0c61b4d03890e2d2eab550610c build by root@node2.test.lab, 2008-05-23 12:58:18
 0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate C r---
    ns:0 nr:1896 dw:1896 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        [>....................] sync'ed: 0.2% (2095156/2097052)K
        finish: 2:02:12 speed: 268 (268) K/sec
        resync: used:0/31 hits:947 misses:1 starving:0 dirty:0 changed:1
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
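The three fields worth watching in that output are cs (connection state), st (roles, local/peer) and ds (disk states, local/peer). As a convenience, they can be pulled out of the device line with awk. This is only a sketch for the /proc/drbd layout of DRBD 8.2.5 shown above; the function name drbd_status and the optional file argument are hypothetical, and only minor 0 is handled.

```shell
# Hypothetical helper: print the cs:, st: and ds: fields of DRBD
# minor 0. The optional argument (default /proc/drbd) is an assumption
# added so the function can be tried on a saved copy of the output.
drbd_status() {
    file="${1:-/proc/drbd}"
    # The device line starts with "0:" after leading whitespace.
    awk '$1 == "0:" {
        out = ""
        for (i = 2; i <= NF; i++)
            if ($i ~ /^(cs|st|ds):/)
                out = (out == "" ? $i : out " " $i)
        print out
    }' "$file"
}
```

Running drbd_status on each node gives a one-line summary instead of the full /proc/drbd dump.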
By now, our DRBD device is fully operational, even before the initial synchronization has completed. We can now continue to configure GFS...
- Configuring your nodes to support GFS
Before we can configure GFS, we need a little help from RHCS. The following packages need to be installed on both systems:
- "cman" (Red Hat Cluster Manager)
- "lvm2-cluster" (LVM with cluster support)
- "gfs-utils" or "gfs2-utils" (GFS1 utils or GFS2 utils; at the time of writing, I prefer GFS1)
- "kmod-gfs", or "kmod-gfs-xen" for Xen (GFS kernel module)
* We must enable and start the following system services on both nodes:
- cman: it will run ccsd, fenced, dlm and openais.
- clvmd.
- gfs.
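On CentOS 5, enabling a service at boot is normally done with chkconfig. The sketch below lists the three services in the order given above and, as a precaution, only prints the commands unless it is called with --apply. The function name enable_cluster_services and the --apply switch are hypothetical; starting the services is still done by hand as described in the following steps.

```shell
# Hypothetical helper: enable cman, clvmd and gfs at boot.
# Without arguments it only prints the commands it would run;
# with --apply it actually calls chkconfig (requires root).
enable_cluster_services() {
    for svc in cman clvmd gfs; do
        if [ "${1:-}" = "--apply" ]; then
            chkconfig "$svc" on
        else
            echo "chkconfig $svc on"
        fi
    done
}
```

Call enable_cluster_services first to review the commands, then enable_cluster_services --apply on both nodes.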
starting cman:
Before we can start cman, we have to configure /etc/cluster/cluster.conf. I use the following configuration:

<?xml version="1.0"?>
<cluster name="my-cluster" config_version="1">
  <cman two_node="1" expected_votes="1">
  </cman>
  <clusternodes>
    <clusternode name="node1.test.lab" votes="1" nodeid="1">
      <fence>
        <method name="single">
          <device name="human" ipaddr="192.168.1.1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.test.lab" votes="1" nodeid="2">
      <fence>
        <method name="single">
          <device name="human" ipaddr="192.168.1.2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fence_devices>
    <fence_device name="human" agent="fence_manual"/>
  </fence_devices>
</cluster>
After editing /etc/cluster/cluster.conf, we have to start cman on both nodes at the same time:
on node1:
[root@node1 i386]# /etc/init.d/cman start
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... done
Starting daemons... done
Starting fencing... done
[ OK ]

on node2:
[root@node2 i386]# /etc/init.d/cman start
Starting cluster:
Loading modules... done
Mounting configfs... done
Starting ccsd... done
Starting cman... done
Starting daemons... done
Starting fencing... done
[ OK ]
check nodes:
[root@node1 i386]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M      4   2008-05-23 14:33:25  node1.test.lab
   2   M    316   2008-05-23 14:41:34  node2.test.lab
In the 'Sts' column, 'M' means the node is a cluster member and everything is going fine; if it is 'X', there is a problem with that node.
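That check can be automated by filtering the cman_tool nodes output for any row whose Sts column is not 'M'. A minimal sketch, assuming the column layout shown above (node id first, status second, node name last); the function name cluster_non_members is hypothetical and it reads the listing from standard input.

```shell
# Hypothetical helper: read `cman_tool nodes` output on stdin and print
# the name of every node whose status column is not "M".
# Prints nothing when all nodes are members.
cluster_non_members() {
    # Only rows that begin with a numeric node id are node entries;
    # this skips the header line.
    awk '$1 ~ /^[0-9]+$/ && $2 != "M" { print $NF }'
}
```

Usage: cman_tool nodes | cluster_non_members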
- starting CLVMD:

First we need to change the locking type in /etc/lvm/lvm.conf to 3 on both nodes:
vi /etc/lvm/lvm.conf
change
locking_type = 1
to
locking_type = 3

We also need to change the filter option so that vgscan does not see a duplicate PV (the duplicate appears because our xvdb1 is the backing device for drbd0). I changed the filter like this:
#filter = [ "a/.*/" ]
filter = [ "a|xvda.*|", "a|drbd.*|", "r|xvdb.*|" ]
In my filter, "a|xvda.*|" accepts all xvda partitions, "a|drbd.*|" accepts all drbd devices, and "r|xvdb.*|" rejects (ignores) all xvdb partitions, one of which is our backing partition xvdb1.
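LVM walks the filter patterns in order and the first one that matches a device name decides whether it is accepted or rejected; a device that matches no pattern is accepted. The sketch below imitates that rule for the three patterns above so you can see which way a device name would go. It is a simplified model and does not call LVM itself; the function name lvm_filter_decision is hypothetical.

```shell
# Hypothetical helper: emulate first-match-wins evaluation of
#   filter = [ "a|xvda.*|", "a|drbd.*|", "r|xvdb.*|" ]
# for a single device path. Prints "accept" or "reject".
lvm_filter_decision() {
    dev="$1"
    # Each rule is "<action>:<regex>", in the same order as the filter line.
    for rule in 'a:xvda.*' 'a:drbd.*' 'r:xvdb.*'; do
        action="${rule%%:*}"
        pattern="${rule#*:}"
        if printf '%s\n' "$dev" | grep -Eq "$pattern"; then
            if [ "$action" = "a" ]; then
                echo accept
            else
                echo reject
            fi
            return 0
        fi
    done
    # No pattern matched: the device is accepted.
    echo accept
}
```

With these rules, /dev/drbd0 is accepted and /dev/xvdb1 is rejected, which is what keeps vgscan from reporting the PV twice.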
Save and exit. The first thing to do is run vgscan, so that it reads the new configuration:
[root@node1 i386]# vgscan
Reading all physical volumes. This may take a while...
Found volume group "VolGroup00" using metadata type lvm2
- The following commands must be run on one node only; I used node1.
Now create our PV:
[root@node1 i386]# pvcreate /dev/drbd0
Physical volume "/dev/drbd0" successfully created

creating our volume group:
[root@node1 i386]# vgcreate my-vol /dev/drbd0
Volume group "my-vol" successfully created
[root@node1 i386]# vgdisplay
--- Volume group ---
VG Name               my-vol
System ID
Format                lvm2
Metadata Areas        1
Metadata Sequence No  1
VG Access             read/write
VG Status             resizable
Clustered             yes
Open LV               0
Max PV                0
Cur PV                1
Act PV                1
VG Size               2.00 GB
PE Size               4.00 MB
Total PE              511
Alloc PE / Size       0 / 0
Free PE / Size        511 / 2.00 GB
VG UUID               UaUK5v-P3aX-nmCn-Oj3F-XQox-AgxB-UsM0xS

Did you notice "Clustered yes"?
creating our LV:
[root@node1 i386]# lvcreate -L1.9G --name my-lv my-vol
Rounding up size to full physical extent 1.90 GB
Error locking on node node2.test.lab: device-mapper: reload ioctl failed: Invalid argument
Failed to activate new LV.

(The error concerns node2 only; we will come back to it when mounting on the second node.)
creating the GFS:
[root@node1 i386]# gfs_mkfs -p lock_dlm -t my-cluster:www -j 2 /dev/my-vol/my-lv
This will destroy any data on /dev/my-vol/my-lv.

Are you sure you want to proceed? [y/n] y

Device: /dev/my-vol/my-lv
Blocksize: 4096
Filesystem Size: 433092
Journals: 2
Resource Groups: 8
Locking Protocol: lock_dlm
Lock Table: my-cluster:www

Syncing...
All Done
start the gfs service:
[root@node1 i386]# /etc/init.d/gfs start

mount it on the first node (the /www mount point must already exist):
[root@node1 i386]# mount -t gfs /dev/my-vol/my-lv /www
[root@node1 i386]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      9.1G  3.4G  5.3G  40% /
/dev/xvda1             99M   17M   78M  18% /boot
tmpfs                 129M     0  129M   0% /dev/shm
/dev/my-vol/my-lv     1.7G   20K  1.7G   1% /www
[root@node1 i386]# ls -lth /www/
total 0
mount it on the second node:
Now you have to wait until the initial device synchronization finishes. To check:
[root@node2 i386]# cat /proc/drbd
version: 8.2.5 (api:88/proto:86-88)
GIT-hash: 9faf052fdae5ef0c61b4d03890e2d2eab550610c build by root@node2, 2008-05-23 12:58:18
 0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate C r---
    ns:0 nr:1970404 dw:1970404 dr:0 al:0 bm:119 lo:0 pe:0 ua:0 ap:0
        [=================>..] sync'ed: 93.4% (143276/2097052)K
        finish: 0:08:57 speed: 252 (232) K/sec
        resync: used:0/31 hits:976756 misses:120 starving:0 dirty:0 changed:120
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
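Rather than re-running cat by hand, the wait can be scripted by polling until both disk states read UpToDate. This is a rough sketch under the assumption that there is a single DRBD device, so any "ds:UpToDate/UpToDate" in the file refers to r0; the function name wait_for_sync and its arguments (file, polling interval in seconds, maximum number of polls) are hypothetical.

```shell
# Hypothetical helper: poll a /proc/drbd-style file until it shows
# ds:UpToDate/UpToDate. Returns 0 once in sync, 1 if max_tries polls
# pass without reaching that state.
wait_for_sync() {
    file="${1:-/proc/drbd}"
    interval="${2:-10}"
    max_tries="${3:-1000}"
    tries=0
    until grep -q 'ds:UpToDate/UpToDate' "$file"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}
```

On node2 this could be combined with the promotion step as: wait_for_sync && drbdadm primary r0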
After it finishes, we need to make node2 primary before we can mount the filesystem:
[root@node2 i386]# drbdadm primary r0
[root@node2 i386]# cat /proc/drbd
version: 8.2.5 (api:88/proto:86-88)
GIT-hash: 9faf052fdae5ef0c61b4d03890e2d2eab550610c build by root@node2, 2008-05-23 12:58:18
 0: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate C r---
    ns:0 nr:2113680 dw:2113680 dr:0 al:0 bm:128 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:1048386 misses:128 starving:0 dirty:0 changed:128
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

Notice "st:Primary/Primary": that is what we want!
Now check the volume group:
[root@node2 ~]# vgscan
Reading all physical volumes. This may take a while...
Found volume group "VolGroup00" using metadata type lvm2
Found volume group "my-vol" using metadata type lvm2

mount it!
[root@node2 i386]# /etc/init.d/gfs start
[root@node2 i386]# mkdir /www
[root@node2 i386]# mount -t gfs /dev/my-vol/my-lv /www
/sbin/mount.gfs: can't open /dev/my-vol/my-lv: No such file or directory
Oops! Do you remember the error "Error locking on node node2.test.lab: device-mapper: reload ioctl failed: Invalid argument" when we created our LV on the first node? The fix is easy: restart clvmd on node2 and try mounting again:
[root@node2 i386]# /etc/init.d/clvmd restart
Deactivating VG my-vol: 0 logical volume(s) in volume group "my-vol" now active
[ OK ]
Stopping clvm: [ OK ]
Starting clvmd: [ OK ]
Activating VGs 2 logical volume(s) in volume group "VolGroup00" now active
1 logical volume(s) in volume group "my-vol" now active
[ OK ]
[root@node2 i386]# mount -t gfs /dev/my-vol/my-lv /www
Aha, let's touch some data:
[root@node2 i386]# touch /www/hi
[root@node2 i386]# ls -lth /www/
total 8.0K
-rw-r--r-- 1 root root 0 May 23 16:35 hi

and from node1:
[root@node1 i386]# ls -lth /www/
total 8.0K
-rw-r--r-- 1 root root 0 May 23 16:35 hi