Re: add volatile flag to PV/LVs (for cache) to avoid degraded state on reboot

On 17. 01. 24 at 23:00, Gionatan Danti wrote:
On 2024-01-17 12:08, Zdenek Kabelac wrote:
It's also not completely true that even a 'writethrough' cache cannot have
dirty blocks (i.e. blocks present only in the cache because writes to the
origin failed).

Hi, really? From dm-cache docs:

"If writethrough is selected then a write to a cached block will not
complete until it has hit both the origin and cache devices.  Clean
blocks should remain clean."

So I would not expect to see dirty blocks on a write-through cache, unless the origin device is unable to write at all - which means that removing the cache device would be no worse than not having it in the first place.

What am I missing?


The cache can contain blocks that are still being 'synchronized' to the cache origin. So while the writing process does not get an ACK for those writes, the cache may hold valid blocks that are 'dirty' in the sense that they have not yet been synchronized to the origin device.
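For illustration only, those counters are visible from userspace as well (the VG/LV names below are just placeholders):

  # dirty blocks = cache blocks not yet synchronized to the origin
  lvs -o name,cache_total_blocks,cache_dirty_blocks vg/cached_lv

  # the same information from the kernel cache target status line
  dmsetup status vg-cached_lv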

And while this is usually not a problem when the system works properly,
it gets into a weird 'state machine' model when e.g. the origin device has errors - which might even be 'transient', given all the variety of storage types and raid arrays with integrity and self-healing and so on...

So while it's usually not a problem for a laptop with 2 disks, the world is more complex...


But ATM we are not seeing it as some major trouble.  A hotspot cache is
simply not supposed to be randomly removed from your system - as it is
not easy to rebuild.
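To be clear, the supported way to take the cache out is to detach it explicitly rather than just pulling the device (LV names here are only placeholders):

  lvconvert --splitcache vg/cached_lv   # keep the cache pool around for later reuse
  lvconvert --uncache vg/cached_lv      # or drop it completely; dirty blocks are flushed first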

As a write-through cache should not contain dirty data, using a single SSD for caching should be OK. I think that if such an expendable (and write-through) SSD fails, one should be able to boot without issues.

This is mostly true - yet lvm2 should be 'available' in the boot ramdisk, and the booting process should ideally be able to recognize the problem, call some sort of 'lvconvert --repair', and proceed with the boot.
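Roughly, such an initramfs hook could try something along these lines (a sketch only - the LV names are placeholders and the right action depends on how the cache PV actually failed):

  lvconvert --repair vg/cached_lv               # attempt to repair the cache(-pool) metadata
  # or, if the cache device is gone for good, drop the cache without flushing:
  lvconvert --uncache --force -y vg/cached_lv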

As mentioned, there is some similarity with a raid with a failed leg - so some sort of 'degraded' activation might also be an option here. But it further needs an lvm2 metadata update to maintain the 'state' of the metadata - so that if there is another 'reboot' and the PV with the cache appears back, it does not interfere with the system (i.e. by providing some historical cached blocks) - just like a mirrored leg needs some care...
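For comparison, this is how degraded activation already looks for raid LVs today - something similar could conceivably be offered for cached LVs (names are placeholders):

  lvchange -ay --activationmode degraded vg/raid_lv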

Regards

Zdenek




