Using SSDs as a cache on Devuan

Speeding up disk read/write access

I’m now running my home server on Devuan for several reasons.

Preferring to own my own data and IT, I have a server to host my information.

For that I have several large, slow 2TB SATA drives in a software RAID array. Unfortunately, I’m using RAID 5 for now, but hope to move to RAID 10 sometime later.

Software RAID with mdadm works great on Linux. However, there are times I really need more speed; backups and other things come to mind. And, well, I really just love things to be faster.

One could create another VG, backed by SSDs. That would work, but then one needs to manage what VMs and other data sits on which speed tier of disk.

A solution I prefer would be either bcache (or similar, bcachefs, ZFS, flashcache) or LVM caching.

I decided to go with LVM caching. It sounded simple, and I couldn’t find a strong reason to pick one over the other.

About my setup

This setup allows one big volume group with a tiered storage backend that mixes SSDs and slower drives. Frequently used data is cached on the SSDs automatically, so anything that needs faster read/write access will generally just work.

First, the drives I have:

3 x 2TB SATA Drives
2 x 500G SSD Drives

I installed Devuan ascii on the 3 x 2TB SATA drives. The 3 x 2TB SATA drives form two RAID arrays:

RAID 1, with the first 500MB partition on each disk. /boot file system lives here.
RAID 5, with the second partition on each disk. This large RAID group is the PV backend for one big VG, LV and file system (/), with about 4TB of usable space.
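The usable-space figures above follow from simple arithmetic. A minimal sketch, assuming equally sized members and ignoring the 500MB boot partition and metadata overhead (the function name raid_usable is made up for this illustration):

```shell
#!/bin/sh
# Usable capacity for common RAID levels, given the member count and the
# size of one member (any unit; the result is in the same unit).
raid_usable() {
    level=$1 members=$2 size=$3
    case $level in
        0)  echo $(( members * size )) ;;          # striping: all space usable
        1)  echo "$size" ;;                        # mirroring: one member's worth
        5)  echo $(( (members - 1) * size )) ;;    # one member's worth goes to parity
        10) echo $(( members * size / 2 )) ;;      # striped mirrors (assumes 2 copies)
        *)  echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}

raid_usable 5 3 2000   # RAID 5 over 3 x 2TB, in GB: prints 4000
raid_usable 1 3 500    # RAID 1 over 3 x 500MB /boot partitions, in MB: prints 500
```

The 4000GB here is in decimal units; expressed in binary units it is about 3.64TiB, which is the figure LVM reports for this PV.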

Great, so now the OS is up and running. Let’s move on to making this large data store faster.

Setting up LVM cache on Devuan ascii

In the interest of making things even faster, while also doubling the SSD space available, I will create a RAID0 with the two SSDs.

The downside of this, of course, is that if one drive fails, the caching data is lost and this could cause corruption. The persistent data lives on the backend RAID5. SSDs don’t fail that often and this is my home server, so this will work for me.

Creating the RAID 0 array.

mdadm --create --verbose --level=0 --metadata=1.2 --raid-devices=2 /dev/md3 /dev/sd[ab]
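After creating the array it is worth confirming that it came up with the expected RAID level. A minimal sketch that reads a file in /proc/mdstat format; the helper name md_level and the sample contents are made up for illustration, and it only handles the plain "mdX : active raidN ..." line layout:

```shell
#!/bin/sh
# Print the RAID personality (e.g. raid0) of an active md device, read from
# a file in /proc/mdstat format. Expected line layout:
#   md3 : active raid0 sdb[1] sda[0]
# Exits non-zero when the device is not listed as active.
md_level() {
    dev=$1
    file=${2:-/proc/mdstat}
    awk -v dev="$dev" '$1 == dev && $3 == "active" { print $4; found = 1 }
                       END { exit found ? 0 : 1 }' "$file"
}

# Demo on a hypothetical sample rather than the live /proc/mdstat.
sample=$(mktemp)
printf '%s\n' 'Personalities : [raid0] [raid1]' \
              'md3 : active raid0 sdb[1] sda[0]' \
              '      976508928 blocks super 1.2 512k chunks' > "$sample"
md_level md3 "$sample"   # prints raid0
rm -f "$sample"
```

On the real system, md_level md3 with no second argument reads /proc/mdstat directly; mdadm --detail /dev/md3 gives a fuller report.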

Next we front our large, slow SATA RAID5 array with the two SSDs, which should now offer roughly double the throughput of a single SSD thanks to the RAID0 array.

vgextend stone_vg0 /dev/md3

lvcreate --type cache --cachemode writeback -l 100%FREE -n root_cachepool stone_vg0/root /dev/md3

Above I’m using writeback and not writethrough. This is better for write performance, but at the cost of a higher risk of data loss if the drive used for the cache fails, a risk that is now greater thanks to the RAID0 array. Ideally I should also put the server behind a UPS to protect against power issues, which are a common occurrence in South Africa.

That’s mostly all there is to this setup - quite simple.

To further demonstrate the layout of all:

root@stone:~# pvs
  PV         VG        Fmt  Attr PSize   PFree
  /dev/md1   stone_vg0 lvm2 a--    3.64t    0
  /dev/md127 stone_vg0 lvm2 a--  931.27g    0

root@stone:~# lvs
  LV   VG        Attr       LSize Pool             Origin       Data%  Meta%  Move Log Cpy%Sync Convert
  root stone_vg0 Cwi-aoC--- 3.64t [root_cachepool] [root_corig] 100.00 46.16           0.00

root@stone:~# df -h /
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/stone_vg0-root  3.6T  1.5T  2.2T  40% /
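The Attr column in the lvs output is what shows that the LV is cached: the first character is the volume type and the seventh is the target type, and both read C for a cache LV. A minimal sketch of checking that in a script (the function name is_cached_attr is made up for this example):

```shell
#!/bin/sh
# Return success if an lvs Attr string describes a cached LV: the first
# character (volume type) and the seventh (target type) are both "C".
is_cached_attr() {
    case $1 in
        C?????C*) return 0 ;;
        *)        return 1 ;;
    esac
}

is_cached_attr 'Cwi-aoC---' && echo "cached"       # Attr from the lvs output above
is_cached_attr '-wi-ao----' || echo "not cached"   # typical Attr of a plain linear LV
```

On a live system the string could come from lvs --noheadings -o lv_attr stone_vg0/root, with the leading whitespace stripped first.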

To allow the system to reboot, one also needs to do the following steps, which I did not expect:

Ensure all your RAID arrays are listed in /etc/mdadm/mdadm.conf.
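One way to check this is to compare the output of mdadm --detail --scan against the config file. A minimal sketch, assuming both use the UUID=... spelling; the helper name missing_arrays and the UUIDs in the demo are made up for illustration:

```shell
#!/bin/sh
# List ARRAY lines from a scan file (mdadm --detail --scan format) whose
# UUID does not appear anywhere in the given mdadm.conf.
missing_arrays() {
    scan=$1 conf=$2
    grep '^ARRAY' "$scan" | while read -r line; do
        uuid=$(printf '%s\n' "$line" | sed -n 's/.*UUID=\([^ ]*\).*/\1/p')
        [ -n "$uuid" ] || continue
        grep -qF "UUID=$uuid" "$conf" || printf '%s\n' "$line"
    done
}

# Demo with hypothetical arrays: md1 is already configured, md3 is not.
scan_file=$(mktemp); conf_file=$(mktemp)
cat > "$scan_file" <<'EOF'
ARRAY /dev/md/1 metadata=1.2 UUID=11111111:11111111:11111111:11111111
ARRAY /dev/md/3 metadata=1.2 UUID=33333333:33333333:33333333:33333333
EOF
cat > "$conf_file" <<'EOF'
ARRAY /dev/md/1 metadata=1.2 UUID=11111111:11111111:11111111:11111111
EOF
missing_arrays "$scan_file" "$conf_file"   # prints only the /dev/md/3 line
rm -f "$scan_file" "$conf_file"
```

On the real system the scan file would come from mdadm --detail --scan, and any lines printed could then be appended to /etc/mdadm/mdadm.conf as root.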

Then, configure the modules needed for boot:

echo "dm_cache" >> /etc/initramfs-tools/modules
echo "dm_cache_mq" >> /etc/initramfs-tools/modules
echo "dm_persistent_data" >> /etc/initramfs-tools/modules
echo "dm_bufio" >> /etc/initramfs-tools/modules
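Because echo with >> appends every time it runs, repeating these commands leaves duplicate lines in the file. A minimal sketch of an idempotent alternative (the function name add_module is made up for this example; the demo writes to a scratch file rather than the real one):

```shell
#!/bin/sh
# Append a module name to a modules file only if it is not already listed,
# so re-running the setup does not create duplicate lines.
add_module() {
    file=$1 module=$2
    grep -qxF "$module" "$file" 2>/dev/null || echo "$module" >> "$file"
}

# Demo against a scratch file; on the real system the target would be
# /etc/initramfs-tools/modules (needs root).
modules_file=$(mktemp)
for m in dm_cache dm_cache_mq dm_persistent_data dm_bufio dm_cache; do
    add_module "$modules_file" "$m"
done
cat "$modules_file"   # four lines: the repeated dm_cache was skipped
rm -f "$modules_file"
```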

Ensure thin-provisioning-tools is installed.

Create this script:

cat > /etc/initramfs-tools/hooks/lvmcache << EOF
#!/bin/sh

PREREQ="lvm2"

prereqs()
{
    echo "\$PREREQ"
}

case \$1 in
prereqs)
    prereqs
    exit 0
    ;;
esac

if [ ! -x /usr/sbin/cache_check ]; then
    exit 0
fi

. /usr/share/initramfs-tools/hook-functions

copy_exec /usr/sbin/cache_check

manual_add_modules dm_cache dm_cache_mq
EOF
chmod 0755 /etc/initramfs-tools/hooks/lvmcache

And rebuild the initial ramdisks:

update-initramfs -u -k all
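Before rebooting, it may be worth confirming that the hook actually put cache_check and the dm-cache module into the image. A minimal sketch that checks a file listing in the one-path-per-line form that lsinitramfs prints; the function name check_initramfs_listing and the kernel version in the demo are made up for illustration:

```shell
#!/bin/sh
# Read an initramfs file listing (one path per line) on stdin and report
# whether cache_check and the dm-cache kernel module are present.
# Returns non-zero if anything is missing.
check_initramfs_listing() {
    listing=$(cat)
    rc=0
    for pattern in '/cache_check$' '/dm-cache\.ko'; do
        if printf '%s\n' "$listing" | grep -q "$pattern"; then
            echo "found:   $pattern"
        else
            echo "MISSING: $pattern"
            rc=1
        fi
    done
    return $rc
}

# Demo on a hypothetical listing.
printf '%s\n' 'usr/sbin/cache_check' \
    'lib/modules/4.9.0-6-amd64/kernel/drivers/md/dm-cache.ko' |
    check_initramfs_listing
```

On the real system the listing would come from something like lsinitramfs /boot/initrd.img-$(uname -r) piped into the function.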

The system should now be able to reboot successfully.

If you want to stop the cache and undo this work, just run:

lvconvert --uncache stone_vg0/root

Testing LVM caching speeds

I timed the copy below of a large 4G video file:

root@stone:~# rm -f SomeLargeVideo.mkv ; echo 3 > /proc/sys/vm/drop_caches && time cp /data/media/videos/movies/SomeLargeVideo.mkv /root/

real    0m5.640s
user    0m0.024s
sys     0m3.584s

This consistently takes about 5-6 seconds to copy the 4G file.
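To make "consistently" easier to check, the timed copy can be wrapped in a small loop. A minimal sketch, assuming GNU date for sub-second timestamps; the function name bench_copy and the DROP_CACHES switch are made up for this example:

```shell
#!/bin/sh
# Time repeated copies of one file, printing the elapsed seconds per run.
# Set DROP_CACHES=1 (as root) to flush the page cache before each run, as in
# the manual test; without it the numbers include RAM caching effects.
# Unlike the manual test, the copy is removed again after the last run.
bench_copy() {
    src=$1 dest_dir=$2 runs=${3:-3}
    dest=$dest_dir/$(basename "$src")
    i=1
    while [ "$i" -le "$runs" ]; do
        rm -f "$dest"
        if [ "${DROP_CACHES:-0}" = 1 ]; then
            sync
            { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null || true
        fi
        start=$(date +%s.%N)
        cp "$src" "$dest"
        end=$(date +%s.%N)
        awk -v s="$start" -v e="$end" -v n="$i" \
            'BEGIN { printf "run %d: %.3f s\n", n, e - s }'
        i=$(( i + 1 ))
    done
    rm -f "$dest"
}

# Example invocation with the paths used in this post (run as root):
#   DROP_CACHES=1 bench_copy /data/media/videos/movies/SomeLargeVideo.mkv /root 3
```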

So the speed is:

$ units
  You have: 4 gigabytes / 5s
  You want: megabytes / 1s
          * 800

So we are getting roughly 800MB/s (or about 710MB/s if the measured 5.64s is used instead of the rounded 5s).
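The same conversion can be done without units. A minimal sketch, assuming the file is exactly 4 x 10^9 bytes (the post only says "4G") and using decimal megabytes; the function name throughput_mbs is made up for this example:

```shell
#!/bin/sh
# Throughput in decimal megabytes per second for a transfer of the given
# number of bytes that took the given number of seconds.
throughput_mbs() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.0f\n", bytes / secs / 1000000 }'
}

throughput_mbs 4000000000 5       # the rounded 5s used above: prints 800
throughput_mbs 4000000000 5.640   # the measured wall-clock time: prints 709
```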

When we remove the LVM cache and try again:

root@stone:~# lvconvert --uncache stone_vg0/root
Do you really want to remove and DISCARD logical volume stone_vg0/root_cachepool? [y/n]: y
  Flushing 8 blocks for cache stone_vg0/root.
  Logical volume "root_cachepool" successfully removed
  Logical volume stone_vg0/root is not cached.
root@stone:~# echo $?
0
root@stone:~# rm -f SomeLargeVideo.mkv ; echo 3 > /proc/sys/vm/drop_caches && time cp /data/media/videos/movies/SomeLargeVideo.mkv /root/

real    0m24.172s
user    0m0.016s
sys     0m4.556s

root@stone:~# rm -f SomeLargeVideo.mkv ; echo 3 > /proc/sys/vm/drop_caches && time cp /data/media/videos/movies/SomeLargeVideo.mkv /root/

real    0m23.462s
user    0m0.028s
sys     0m4.524s

We get about 24s, which is a lot slower.

$ units
You have: 4 gigabytes / 24s
You want: megabytes / 1s
        * 166.66667

So with LVM caching enabled, copying a 4GB file transfers at about 800MB/s. With it off, it transfers at about 167MB/s.
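Put as a ratio, that is a speedup of a bit over four times. A minimal sketch of the arithmetic, using the wall-clock times from the runs above (the function name speedup is made up for this example):

```shell
#!/bin/sh
# Ratio of two elapsed times, to one decimal place.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

speedup 24 5           # the rounded times used with units: prints 4.8
speedup 24.172 5.640   # the measured times: prints 4.3
```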

All seems pretty good to me.

One last thing: I’ve noticed that if your frontend cache is unavailable during boot (say it fails), the system fails to boot. This is not ideal and I’d like to get around to fixing that. However, if that happens, you can simply remove the frontend cache from LVM in a rescue environment and reboot.