Re: Possible bug in thin metadata size with Linux MDRAID


 



On 09/03/2017 12:53, Zdenek Kabelac wrote:

> Hmm - it would be interesting to see your 'metadata' - it should be still
> quite good fit 128M of metadata for 512G when you are not using snapshots.
>
> What's been your actual test scenario ?? (Lots of LVs??)


Nothing unusual - I had a single thinvol with an XFS filesystem used to store an HDD image gathered using ddrescue.

Anyway, are you sure that a 128 MB metadata volume is "quite good" for a 512 GB volume with 128 KB chunks? My testing suggests otherwise. For example, take a look at this empty thinpool/thinvol:

[root@gdanti-laptop test]# lvs -a -o +chunk_size
LV               VG        Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert Chunk
[lvol0_pmspare]  vg_kvm    ewi-------  128.00m                                                              0
thinpool         vg_kvm    twi-aotz--  500.00g                 0.00   0.81                                  128.00k
[thinpool_tdata] vg_kvm    Twi-ao----  500.00g                                                              0
[thinpool_tmeta] vg_kvm    ewi-ao----  128.00m                                                              0
thinvol          vg_kvm    Vwi-a-tz--  500.00g thinpool        0.00                                         0
root             vg_system -wi-ao----   50.00g                                                              0
swap             vg_system -wi-ao----    3.75g                                                              0

As you can see, since it is an empty volume, metadata usage is at only 0.81%. Let's write 5 GB (1% of the thin data volume):

[root@gdanti-laptop test]# lvs -a -o +chunk_size
LV               VG        Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert Chunk
[lvol0_pmspare]  vg_kvm    ewi-------  128.00m                                                              0
thinpool         vg_kvm    twi-aotz--  500.00g                 1.00   1.80                                  128.00k
[thinpool_tdata] vg_kvm    Twi-ao----  500.00g                                                              0
[thinpool_tmeta] vg_kvm    ewi-ao----  128.00m                                                              0
thinvol          vg_kvm    Vwi-a-tz--  500.00g thinpool        1.00                                         0
root             vg_system -wi-ao----   50.00g                                                              0
swap             vg_system -wi-ao----    3.75g                                                              0

Metadata has grown by roughly the same 1%. Accounting for the initial 0.81% utilization, this means that a nearly full data volume (with *no* overprovisioning nor snapshots) will exhaust its metadata *before* actually becoming 100% full.
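Extrapolating from the two Meta% readings above (a rough sketch; the per-mapping byte cost below is back-derived from the rounded lvs output, not taken from dm-thin internals):

```python
MiB = 1024 * 1024
KiB = 1024

meta_size = 128 * MiB                  # tmeta volume size
chunk = 128 * KiB                      # thin pool chunk size
written = 5 * 1024 * MiB               # 5 GiB written (1% of data)

chunks_written = written // chunk              # 40960 new mappings
meta_delta = (1.80 - 0.81) / 100 * meta_size   # metadata consumed by them
per_mapping = meta_delta / chunks_written      # ~32.4 bytes/mapping (observed)

total_chunks = 500 * 1024 * MiB // chunk       # mappings for a 100% full volume
projected = 0.0081 * meta_size + total_chunks * per_mapping
print(f"observed cost: ~{per_mapping:.1f} B/mapping")
print(f"projected Meta% at 100% data: {projected / meta_size:.1%}")
```

The linear projection lands around 99.8% - essentially zero headroom, and since lvs rounds its percentages and the metadata btree has its own node overhead, the real figure can cross 100% before the data does.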

While I can absolutely understand that this is expected behavior when using snapshots and/or overprovisioning, in this extremely simple case metadata should not be exhausted before data. In other words, the initial metadata creation process should *at least* consider that a plain volume can become 100% full, and allocate accordingly.

The interesting part is that when not using MD, everything works properly: metadata is about 2x its minimal value (as reported by thin_metadata_size), which provides ample buffer for snapshotting/overprovisioning. When using MD, the bad interaction between RAID chunks and thin metadata chunks ends with a too-small metadata volume.
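For comparison, a sketch of the commonly quoted sizing rule of thumb - roughly 64 bytes of metadata per data chunk (treat both the constant and the helper name as approximations, not lvm2's exact formula):

```python
MiB = 1024 * 1024
KiB = 1024
GiB = 1024 * MiB

def estimate_tmeta(data_size, chunk_size, bytes_per_chunk=64):
    """Rule-of-thumb tmeta estimate: one mapping per chunk times an
    assumed ~64 B cost (an approximation, not dm-thin's on-disk layout)."""
    return data_size // chunk_size * bytes_per_chunk

est = estimate_tmeta(500 * GiB, 128 * KiB)
print(f"estimated tmeta for 500 GiB / 128 KiB chunks: {est / MiB:.0f} MiB")
```

That comes out around 250 MiB - roughly twice the ~128 MiB bare minimum implied by the observations above, which matches the 2x headroom the non-MD case ends up with.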

This can become very bad. Take a look at what happens when creating a thin pool on an MD RAID whose chunks are 64 KB:

[root@gdanti-laptop test]# lvs -a -o +chunk_size
LV               VG        Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert Chunk
[lvol0_pmspare]  vg_kvm    ewi-------  128.00m                                                              0
thinpool         vg_kvm    twi-a-tz--  500.00g                 0.00   1.58                                  64.00k
[thinpool_tdata] vg_kvm    Twi-ao----  500.00g                                                              0
[thinpool_tmeta] vg_kvm    ewi-ao----  128.00m                                                              0
root             vg_system -wi-ao----   50.00g                                                              0
swap             vg_system -wi-ao----    3.75g                                                              0

The thin pool chunk size is now 64 KB - with the *same* 128 MB metadata volume size. Now the metadata can only address ~50% of the thin volume's space.
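Taking the ~32 bytes per mapping implied by the Meta% jump in the 128 KiB-chunk pool above (an empirical figure derived from this mail's numbers, not a documented constant), halving the chunk size doubles the mapping count, so the fixed 128 MiB tmeta covers only about half the data:

```python
MiB = 1024 * 1024
KiB = 1024

meta_size = 128 * MiB
per_mapping = 32.4             # observed figure; assumption, not a dm-thin constant
data_size = 500 * 1024 * MiB   # 500 GiB thin data volume

for chunk in (128 * KiB, 64 * KiB):
    mappings = meta_size / per_mapping   # mappings the tmeta can hold
    addressable = mappings * chunk       # data those mappings can cover
    print(f"{chunk // KiB:3d} KiB chunks: "
          f"~{addressable / data_size:.0%} of the volume addressable")
```

With 128 KiB chunks the metadata barely covers the whole volume; with 64 KiB chunks it covers only ~51% of it, matching the lvs output above.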

> But as said - there is no guarantee of the size to fit for any possible
> use case - user is supposed to understand what kind of technology he is
> using, and when he 'opt-out' from automatic resize - he needs to deploy
> his own monitoring.

True, but this trivial case should really work without tuning/monitoring. In short, it should fail gracefully in such a simple case...

> Otherwise you would have to simply always create 16G metadata LV if you
> do not want to run out of metadata space.



Absolutely true. I've written this email to report a bug, indeed ;)
Thank you all for this outstanding work.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


