Re: Snapshot behavior on classic LVM vs ThinLVM

On 08-04-2017 00:24, Mark Mielke wrote:

We use lvmthin in many areas... from Docker's dm-thinp driver, to XFS
file systems for PostgreSQL or other data that needs multiple
snapshots, including point-in-time backups of certain snapshots. And
at multiple sizes: I don't know that we have 8 TB anywhere right this
second, but we are using it in a variety of ranges from 20 GB to 4 TB.


Very interesting, this is exactly the information I hoped to get. Thank you for reporting it.
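
For reference, a setup along those lines would look something like the following (the volume group, names and sizes below are just invented for illustration):

  # create a thin pool inside an existing volume group, then carve thin volumes out of it
  lvcreate --type thin-pool -L 1T -n pool0 vg0
  lvcreate -V 100G --thinpool vg0/pool0 -n pgdata
  mkfs.xfs /dev/vg0/pgdata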


When you say "nightly", my experience is that processes are writing
data all of the time. If the backup takes 30 minutes to complete, then
this is 30 minutes of writes that get accumulated, and subsequent
performance overhead of these writes.

But, we usually keep multiple hourly snapshots and multiple daily
snapshots, because we want the option to recover to different points
in time. With the classic LVM snapshot capability, I believe this is
essentially non-functional. While it can work with "1 short lived
snapshot", I don't think it works at all well for "3 hourly + 3 daily
snapshots".  Remember that each write to an area will require that
area to be replicated multiple times under classic LVM snapshots,
before the original write can be completed. Every additional snapshot
is an additional cost.

Right. For such a setup, classic LVM snapshot overhead would be enormous, grinding everything to a halt.
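
Just to picture it (names and CoW sizes here are hypothetical): with classic snapshots, each snapshot gets its own preallocated CoW area, and the first write to any origin chunk must be copied into each of them before it completes:

  lvcreate -s -L 50G -n data-hourly1 vg0/data
  lvcreate -s -L 50G -n data-hourly2 vg0/data
  lvcreate -s -L 50G -n data-daily1  vg0/data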


I am more concerned about lengthy snapshot activation due to a big,
linear CoW table that must be read completely...

I suspect this is a premature-optimization concern, in that you are
theorizing about the impact, but perhaps you haven't measured it
yourself, and if you did, you would find there was no reason to be
concerned. :-)

For classic (non-thinly provisioned) LVM snapshots, relatively big metadata size was a known problem; this very topic was discussed many times on this list. Basically, when the snapshot metadata grew above a certain point (in the multi-gigabyte range), snapshot activation failed due to timeouts on LVM commands. This, in turn, was because legacy snapshot behavior was never really tuned for long-lived, multi-gigabyte snapshots, but rather for a create-backup-remove workflow.
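
It does not fix activation time, of course, but the CoW usage of a classic snapshot can at least be watched with something like the following (VG name invented; on reasonably recent lvm2 the data_percent column shows how full the CoW area is):

  lvs -o lv_name,origin,lv_size,data_percent vg0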


If you absolutely need a contiguous sequence of blocks for your
drives, because your I/O patterns benefit from this, or because your
hardware has poor seek performance (such as, perhaps a tape drive? :-)
), then classic LVM snapshots would retain this ordering for the live
copy, and the snapshot could be as short lived as possible to minimize
overhead to only that time period.

But, in practice - I think the LVM authors of the thinpool solution
selected a default block size that would exhibit good behaviour on
most common storage solutions. You can adjust it, but in most cases I
don't bother, and just use the default. There is also general system
behaviour to take into account: even if you had a purely contiguous
sequence of blocks, your file system probably allocates files all over
the drive anyways. With XFS, I
believe they do this for concurrency, in that two different kernel
threads can allocate new files without blocking each other, because
they schedule the writes to two different areas of the disk, with
separate inode tables.

So, I don't believe the contiguous sequence of blocks is normally a
real thing. Perhaps a security camera that is recording a 1+ TB video
stream might allocate contiguously, but basically nothing else does
this.

True.
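
On the chunk size point: adjusting it is trivial if one ever wants to; something along these lines (pool and VG names invented here, and the default is usually fine):

  lvcreate --type thin-pool -L 1T --chunksize 256K -n pool0 vg0
  lvs -o lv_name,chunk_size vg0/pool0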


To me, LVM thin volumes are the right answer to this problem. It's not
particularly new or novel either. Most "Enterprise" level storage
systems have had this capability for many years. At work, we use
NetApp and they take this to another level with their WAFL =
Write-Anywhere-File-Layout. For our private cloud solution based upon
NetApp AFF 8080EX today, we have disk shelves filled with flash
drives, and NetApp is writing everything "forwards", which extends the
life of the flash drives, and allows us to keep many snapshots of the
data. But, it doesn't have to be flash to take advantage of this. We
also have large NetApp FAS 8080EX or 8060 with all spindles, including
3.5" SATA disks. I was very happy to see this type of technology make
it back into LVM. I think this breathed new life into LVM, and made it
a practical solution for many new use cases beyond being just a more
flexible partition manager.

--

Mark Mielke <mark.mielke@gmail.com>

Yeah, CoW-enabled filesystems are really cool ;) Too bad BTRFS has very low performance when used as a VM backing store...

Thank you very much Mark, I really appreciate the information you provided.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


