Updated August 16th, 2012
I recently had a project that required data storage with deduplication, data integrity assurance, and hardware fault tolerance. Instead of going with a traditional hardware RAID solution I opted to test out ZFS. My configuration utilized a single 120GB SSD drive and four 1TB SATA drives. My solution was to run the main Operating System on the SSD, reserve space for caching on the SSD and utilize ZFS RAID-Z on the four large SATA drives. Here are the steps I took to create such a setup.
Before we get to any of the software configuration, you need to check your server provisioning. I used a six-bay chassis with an Intel E3 CPU, 16GB of RAM, one 120GB SSD as the OS drive, and four spare 1TB drives. The spare drives were NOT part of a hardware RAID array. They should be presented as JBOD so the Operating System can handle each drive individually.
I provisioned a clean install of CentOS 6.2 x86_64 on the OS drive (120GB SSD) and created an unused 16GB partition that I will use for caching. I made sure all the spare drives were visible to the OS, installed yum updates, set up yum to auto-update and configured a basic firewall. I also disabled Security-Enhanced Linux (SELinux).
Once you’re at this point, let’s go ahead and log in to the server as root. We will now install some basic tools that are required before we proceed with the actual ZFS installation:
yum groupinstall "Development Tools"
yum install kernel-devel zlib-devel libuuid-devel libblkid-devel libselinux-devel parted lsscsi nano mdadm bc
You should now take a moment to visit zfsonlinux.org and check for the latest version of SPL and ZFS. The versions below were the latest at the time of writing this howto. We need to download the two tarballs, extract them and build RPMs from them. Once the RPMs are built, we install them:
wget http://github.com/downloads/zfsonlinux/spl/spl-0.6.0-rc10.tar.gz
wget http://github.com/downloads/zfsonlinux/zfs/zfs-0.6.0-rc10.tar.gz
tar xvzpf spl-0.6.0-rc10.tar.gz
tar xvzpf zfs-0.6.0-rc10.tar.gz
cd spl-0.6.0-rc10
./configure
make rpm
rpm -Uvh *.x86_64.rpm
cd ..
cd zfs-0.6.0-rc10
./configure
make rpm
rpm -Uvh *.x86_64.rpm
cd ..
You have now installed SPL and ZFS and can load the ZFS module for use:
modprobe zfs
lsmod | grep -i zfs
After running the above commands you should have seen a list of loaded modules from ZFS. You should now make sure the module is loaded persistently on boot. CentOS 6 does this a bit differently than CentOS 5 did. We need to make a new file and add a script to it. Let’s make the new file:
nano -w /etc/sysconfig/modules/zfs.modules
Into this file enter the following code:
#!/bin/sh
if [ ! -c /dev/zfs ] ; then
    exec /sbin/modprobe zfs >/dev/null 2>&1
fi
To save your file press Ctrl+X, then Y, then Enter. Now we need to make this file executable:
chmod +x /etc/sysconfig/modules/zfs.modules
Ok let’s go ahead and do a reboot, to make sure the module comes on auto-magically.
Once the server is back online, see if you get output from:
lsmod | grep -i zfs
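If you would rather script this check than eyeball it, something like the following could work. It is only a minimal sketch: the function name zfs_module_loaded and the sample listing are made up for illustration, and it assumes nothing more than lsmod printing the module name in the first column.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if an lsmod-style listing on stdin
# contains a module named exactly "zfs" in the first column.
zfs_module_loaded() {
    awk '$1 == "zfs" { found = 1 } END { exit !found }'
}

# Illustrative sample only (sizes and counts are invented); on the real
# server you would run: lsmod | zfs_module_loaded
sample='Module                  Size  Used by
zfs                   100000  0
spl                    10000  1 zfs'

if printf '%s\n' "$sample" | zfs_module_loaded; then
    echo "zfs module is loaded"
else
    echo "zfs module is NOT loaded"
fi
```

Matching the first column exactly avoids false positives from other modules whose names merely contain "zfs".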
Once you’re at this point, if you already know how to use ZFS you can stop reading and start working. Otherwise read on and find out how I made my ZFS setup work for me.
I wanted to use RAID-Z, which is roughly comparable to a standard RAID-5. First I checked that all my drives were online and which device names they had:
[root ~]# fdisk -l | grep GB
Disk /dev/sde: 120.0 GB, 120034123776 bytes
Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
Disk /dev/sdd: 1000.2 GB, 1000204886016 bytes
[root ~]#
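The same listing can be filtered by size in a script, which helps avoid picking the OS drive by mistake. This is a minimal sketch that assumes the "Disk /dev/...: ... bytes" line format shown above; disks_at_least is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: reads fdisk -l style output on stdin and prints
# every disk whose size in bytes is at least the given minimum.
disks_at_least() {
    min_bytes=$1
    awk -v min="$min_bytes" '
        $1 == "Disk" && $2 ~ /^\/dev\// {
            dev = $2
            sub(/:$/, "", dev)              # strip the trailing colon
            if ($5 + 0 >= min + 0) print dev
        }
    '
}

# Listing copied from the output above; on the server itself:
#   fdisk -l | disks_at_least 1000000000000 | sort
disks_at_least 1000000000000 <<'EOF' | sort
Disk /dev/sde: 120.0 GB, 120034123776 bytes
Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
Disk /dev/sdd: 1000.2 GB, 1000204886016 bytes
EOF
```

With a threshold of 1000000000000 bytes this prints the four SATA drives and leaves out the 120GB SSD.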
As you can see, my 1TB drives were sda, sdb, sdc and sdd. These are the drives I will use in my ZFS pool, and I will call that pool ‘storage’ as I’m not that creative. Next I set up the RAID:
zpool create -f storage raidz sda sdb sdc sdd
I now wanted to make sure my ZFS pool called ‘storage’ was indeed created:
[root ~]# zpool status
  pool: storage
 state: ONLINE
 scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0

errors: No known data errors
[root ~]#
Now you can also see that your ZFS pool has been mounted as (in my case) /storage with 2.7TB of usable space:
[root ~]# mount | grep zfs
storage on /storage type zfs (rw,xattr,context="system_u:object_r:file_t:s0")
[root ~]# df -h | grep storage
storage               2.7T     0  2.7T   0% /storage
[root ~]#
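The 2.7T figure is about what you would expect from single-parity RAID-Z: one drive's worth of space goes to parity, and df reports sizes in binary units. As a rough estimate that ignores ZFS metadata overhead (raidz1_usable_tib is a hypothetical helper name):

```shell
#!/bin/sh
# Rough RAID-Z1 capacity estimate: (drives - 1) * bytes per drive,
# expressed in TiB (2^40 bytes). Ignores filesystem overhead.
raidz1_usable_tib() {
    drives=$1
    bytes_per_drive=$2
    awk -v n="$drives" -v b="$bytes_per_drive" \
        'BEGIN { printf "%.1f\n", (n - 1) * b / (1024 ^ 4) }'
}

# Four drives of 1000204886016 bytes each, as reported by fdisk.
raidz1_usable_tib 4 1000204886016   # prints 2.7
```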
If you need /storage mounted automatically at boot, add the mount command to /etc/rc.local:
echo "zfs mount storage" >> /etc/rc.local
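Running that echo twice leaves two copies of the line in rc.local. If that bothers you, a small guard can append the line only when it is missing. This is a sketch with a hypothetical helper name (append_once), and it demonstrates the idea on a temporary file rather than the real /etc/rc.local:

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if the exact line
# is not already present.
append_once() {
    line=$1
    file=$2
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file; on the server the target would be
# /etc/rc.local instead.
demo=$(mktemp)
append_once "zfs mount storage" "$demo"
append_once "zfs mount storage" "$demo"
cat "$demo"
rm -f "$demo"
```

The second call is a no-op, so the scratch file ends up with a single "zfs mount storage" line.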
Nice and easy! We could stop here, but I could have done this much with a normal RAID controller. What we really need are the features of ZFS that a controller does not offer: compression, deduplication and caching. So now we need to enable these features and disable atime for performance.
For SSD caching I’m going to use L2ARC caching, since my project focuses on heavy reads and only light writes. If you remember, I said my SSD was the OS drive and that I made a separate 16GB partition on it. We’re going to use this partition now, but first let’s verify which one it is:
[root ~]# fdisk -l /dev/sde

Disk /dev/sde: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0002824e

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1   *           1          26      204800   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sde2              26        2115    16777216   8e  Linux LVM
/dev/sde3            2115        2376     2097152   82  Linux swap / Solaris
/dev/sde4            2376       14594    98140632    5  Extended
/dev/sde5            2376       14594    98139136   83  Linux
[root ~]#
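The partition sizes can be read off the Blocks column. A quick conversion, assuming (as with this version of fdisk) that the column is in 1 KiB units; blocks_to_gib is a hypothetical helper name:

```shell
#!/bin/sh
# Convert an fdisk "Blocks" value (assumed to be 1 KiB units) to whole GiB.
blocks_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

blocks_to_gib 16777216   # the Blocks value for /dev/sde2; prints 16
```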
So we see the LVM partition is /dev/sde2 which is 16GB. So we will now use that as the cache:
zpool add storage cache sde2
Let’s look at the zpool now:
[root ~]# zpool status
  pool: storage
 state: ONLINE
 scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
        cache
          sde2      ONLINE       0     0     0

errors: No known data errors
[root ~]#
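For a cron job or monitoring check, the config table can be scanned for anything that is not ONLINE. The following is a sketch only: zpool_unhealthy_devices is a hypothetical helper, and it assumes the table layout shown above (devices in other legitimate states, such as hot spares, would also be reported).

```shell
#!/bin/sh
# Hypothetical helper: reads zpool status output on stdin and prints
# "name state" for every config row whose state is not ONLINE.
zpool_unhealthy_devices() {
    awk '
        $1 == "NAME"    { in_config = 1; next }   # table header starts the rows
        $1 == "errors:" { in_config = 0 }         # summary line ends them
        in_config && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Status text copied from the output above; on the server itself:
#   zpool status storage | zpool_unhealthy_devices
bad=$(zpool_unhealthy_devices <<'EOF'
  pool: storage
 state: ONLINE
 scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
        cache
          sde2      ONLINE       0     0     0

errors: No known data errors
EOF
)

if [ -z "$bad" ]; then
    echo "all devices ONLINE"
else
    echo "$bad"
fi
```

The bare "cache" heading has no state column, so the NF >= 2 test skips it. For a quick manual check, zpool status -x gives a similar summary without any parsing.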
Now let’s enable the other features we wanted, compression and deduplication, and disable atime:
zfs set compression=on storage
zfs set dedup=on storage
zfs set atime=off storage
Great, we’re pretty much done. One word of caution for those following my guide is that I’m favoring reads over writes and I’m running powerful hardware with enough RAM to handle both compression and deduplication without a serious performance hit. This particular setup is not what I would recommend for say a database server writing and reading data rapidly. Enjoy your new setup!
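To get a feel for what "enough RAM" means for deduplication, here is a back-of-the-envelope sketch. It leans on two assumptions that are not from my setup notes: the commonly quoted rule of thumb of roughly 320 bytes of RAM per dedup table entry, and every block being unique at the default 128K recordsize. Treat the result as a worst-case ballpark, not a measurement; ddt_ram_mib is a hypothetical helper name.

```shell
#!/bin/sh
# Worst-case dedup table size in MiB:
#   (pool bytes / record size) unique blocks * bytes per table entry.
# The 320-byte entry size is a rule-of-thumb assumption.
ddt_ram_mib() {
    pool_bytes=$1
    record_bytes=$2
    entry_bytes=$3
    echo $(( pool_bytes / record_bytes * entry_bytes / 1024 / 1024 ))
}

# 3 x 1000204886016 bytes of data capacity (four 1TB drives minus one
# for parity), 128K records, 320 bytes per entry.
ddt_ram_mib 3000614658048 131072 320
```

For this pool that works out to a little under 7 GiB if the pool were completely full of unique 128K blocks, which is why 16GB of RAM gave me some headroom. Smaller records would push the number up.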
git clone git://github.com/behlendorf/zfs.git
git clone git://github.com/behlendorf/spl.git
yum install mdadm bc
cd spl
./configure --with-linux=/lib/modules/2.6.32-220.17.1.el6.x86_64/source --with-linux-obj=/lib/modules/2.6.32-220.17.1.el6.x86_64/build
make rpm
rpm -Uvh *.x86_64.rpm
cd ..
cd zfs
./configure --with-linux=/lib/modules/2.6.32-220.17.1.el6.x86_64/source --with-linux-obj=/lib/modules/2.6.32-220.17.1.el6.x86_64/build
make rpm
rpm -Uvh *.x86_64.rpm
cd ..
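The kernel version in the configure paths above is hard-coded to the kernel I was running. A minimal sketch of building the same two flags from whatever kernel is currently running, assuming the matching kernel-devel package is installed so that the source and build links exist (kernel_configure_flags is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: print the --with-linux flags for a given kernel
# version string, following the /lib/modules/<version>/ layout used above.
kernel_configure_flags() {
    kver=$1
    printf '%s\n' "--with-linux=/lib/modules/$kver/source --with-linux-obj=/lib/modules/$kver/build"
}

# Show the flags for the running kernel.
kernel_configure_flags "$(uname -r)"
```

In the build steps this could be used as ./configure $(kernel_configure_flags "$(uname -r)"), left unquoted on purpose so the two flags are passed as separate arguments.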