Re: Network-attached block storage and local SSDs for dm-cache

On Mon, Apr 22, 2019 at 02:25:44PM -0400, Mike Snitzer wrote:
> > I know it's possible to set up dm-cache to combine network-attached
> > block devices and local SSDs, but I'm having a hard time finding any
> > first-hand evidence of this being done anywhere -- so I'm wondering
> > if it's because there are reasons why this is a Bad Idea, or merely
> > because there aren't many reasons for folks to do that.
> > 
> > The reason why I'm trying to do it, in particular, is for
> > mirrors.kernel.org systems, where we already rely on dm-cache to
> > combine large slow spinning disks with SSDs to great advantage.
> > Most hits on those systems are to the same set of files (latest
> > distro package updates), so the dm-cache hit-to-miss ratio is very
> > favourable. However, we need to build the newest iterations of those
> > systems, and being able to use network-attached storage at providers
> > like Packet with local SSD drives would remove the need for us to
> > purchase and host huge drive arrays.
> > 
> > Thanks for any insights you may offer.

> Only thing that could present itself as a new challenge is the
> reliability of the network-attached block devices (e.g. do network
> outages compromise dm-cache's ability to function).

I expect them to be *reasonably* reliable, but of course the chances of network-attached block storage becoming unavailable are higher than for directly-attached storage.

> I've not done any focused testing for, or thinking about, the impact
> unreliable block devices might have on dm-cache (or dm-thinp, etc).
> Usually we advise people to ensure the devices that they layer upon are
> adequately robust/reliable.  Short of that you'll need to create your
> own luck by engineering a solution that provides network storage
> recovery.

I expect that in writethrough mode the worst kind of recovery we'd have to deal with is rebuilding the dm-cache setup: even if the underlying slow storage becomes unavailable, that shouldn't result in filesystem corruption on it. And even though mirrors.kernel.org data is just that, mirrors, we would certainly like to avoid situations where we have to re-sync 40TB from scratch, as that usually means a week-long outage.
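
For concreteness, the shape of the setup I have in mind is roughly the following (all device names, VG/LV names and sizes below are placeholders, not our actual configuration):

    # network-attached volume (slow) and local NVMe (fast) in one VG
    pvcreate /dev/sdb /dev/nvme0n1
    vgcreate mirrors /dev/sdb /dev/nvme0n1

    # origin LV lives entirely on the network-attached PV
    lvcreate -n origin -l 100%PVS mirrors /dev/sdb

    # cache-pool lives entirely on the local SSD
    lvcreate --type cache-pool -n fastpool -L 800G mirrors /dev/nvme0n1

    # attach the cache in writethrough mode, so the origin always holds
    # a consistent copy of the data
    lvconvert --type cache --cachepool mirrors/fastpool \
              --cachemode writethrough mirrors/origin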

If the "origin" device is network-attached and proves unreliable you
can expect to see the dm-cache experience errors.  dm-cache is not
raid.  So if concerned about network outages you might want to (ab)use
dm-multipath's "queue_if_no_path" mode to queue IO for retry once the
network-based device is available again (dm-multipath isn't raid
either, but for your purposes you need some way to isolate potential for
network based faults).  Or do you think you might be able to RAID1 or
RAID5 N of these network attached drives together?

I don't think that makes sense in our case, as these volumes would likely be coming from the same NAS array, so we'd be adding complexity without necessarily hedging against any risk.
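
The queue_if_no_path suggestion sounds more workable, though. If I understand it right, something along these lines in /etc/multipath.conf would make IO queue instead of erroring out while the network path is down (untested sketch, defaults section only):

    defaults {
        user_friendly_names yes
        # "queue" here corresponds to the queue_if_no_path feature:
        # hold IO indefinitely instead of failing it when no path is up
        no_path_retry queue
    }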

Thanks for your help -- I think we're going to try this out as an experimental setup and then see what kinds of issues we run into.
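
(In case it's useful for anyone reading the archives later: the hit/miss numbers I referred to above come straight out of the dm-cache status line, e.g.

    # the cache target's status line includes read hits/misses and
    # write hits/misses; the LV name below is a placeholder
    dmsetup status mirrors-origin

or, if I remember the lvm2 report field names correctly, something like
"lvs -o lv_name,cache_read_hits,cache_read_misses".)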

Best,
-K

_______________________________________________
linux-lvm mailing list
linux-lvm@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
