Directory Snapshots

Published September 15th, 2008, updated March 17th, 2013.

Creating backups of open files was a challenging endeavor in the past. The main problem here is inconsistency. This is because the original data could get changed while it is read by the backup software. If this happens, it can result in missing references or references pointing at incorrect data. A typical example would be any kind of database. Here we have a lot of indices that are stored aside of the raw data. When the backup software reads the index and an update occurs, the original data cannot be read anymore.

A simple strategy to deal with this type of problem is the creation of offline backups. They are fairly simple to implement and they ensure exclusive access by the backup software. Though, this approach is inelegant as it requires a downtime for every backup. To overcome this limitation, many applications spawned custom backup mechanisms to enable online backups. These mechanisms include simple dumps, logshipping, single- and multi-master replication and others. Although, they enable online backups, they are proprietary and require application specific know-how.

A more generic approach is the use of file system snapshots. They create a static copy of the original data within milliseconds. This copy can be backed up while the system is online and the original data gets changed. On Linux, this snapshot functionality is part of the Logical Volume Manager (LVM). It is included in the standard Kernel since version 2.4.x and most modern Linux distributions activate it in their default installation.

dsnapshot on top of lvm

To create a file system snapshot, one basically requires the file system to be on a logical volume (LV) and some free space on the underlying volume group (VG). Then, one can tell the logical volume manager to snapshot the particular volume. The LVM holds all IOs and creates a new copy-on-write (COW) volume within milliseconds. This new snapshot volume can be mounted and backed up safely, as it does not change when it is written to the original volume.

Of course, there are good reasons to use the backup methods that are recommended by some software vendor. But there are also many situations where a generic approach is preferable. Think of small databases, virtual machines, mail servers and other applications that store custom index files. I’ve seen a lot of these situations in the wild and for many times, I desired to have an easy snapshotting functionality. This lead to various snapshotting scripts, consolidation, adoptions and so on. After all, I’ve created dsnapshot which I want to introduce here.

The dsnapshot script provides a high-level interface to the Linux Logical Volume Manager. It uses its block-level snapshot support to create directory snapshots. In contrast to block-level snapshots, directory snapshots resemble the file system layer. Thus, you can snapshot any directory that is on a logical volume and you don’t have to worry about the actual logical volumes, mount points and paths.

This is the actual syntax for creating…

$ dsnapshot --create /srv/mysql/test/

… and removing a directory snapshot.

$ dsnapshot --remove /var/lib/dsnapshot/srv-fdf2e6dc/mysql/test/

I’ve found this script very handy when you need to backup single directories instead of whole volumes.

github repo