Re: Reserve space for specific thin logical volumes

On 12/09/2017 13:01, Zdenek Kabelac wrote:
There is a very good reason why thinLV is fast - when you work with a thinLV,
you work only with the data set of that single thin LV.

So you write to the thinLV and either you modify an existing exclusively owned chunk
or you duplicate it and provision a new one. A single thinLV does not care about
other thin volumes - this is very important to think about, and it matters for reasonable performance and for memory and CPU usage.

Sure, I grasp that.

I think you need to think 'wider'.

You do not need to use a single thin-pool - you can have numerous thin-pools,
and for each one you can maintain separate thresholds (for now with your own
scripting - but doable with today's lvm2).
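
(Something along these lines, I guess - a rough sketch of such a per-pool check, with made-up VG/pool names and thresholds:)

#!/bin/sh
# rough sketch: warn when a given thin pool exceeds its own data threshold
# (VG/pool names and limits are only examples)
check_pool() {    # $1 = vg/pool, $2 = max data_percent
    used=$(lvs --noheadings -o data_percent "$1" | tr -d ' ')
    if [ "${used%%.*}" -ge "$2" ]; then
        echo "WARNING: $1 is at ${used}% data usage (limit ${2}%)"
    fi
}
check_pool vg/pool_critical 50   # pool without over-provisioning, alert early
check_pool vg/pool_bulk     85   # over-provisioned pool, alert later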

Why would you want to place a 'critical' volume into the same pool
as some non-critical one?

It's simply way easier to have critical volumes in a different thin-pool,
where you might not even use over-provisioning.

I need to take a step back: my main use for thinp is as a virtual machine backing store. Due to some limitations in libvirt and virt-manager, which basically do not recognize thin pools, I cannot use multiple thin pools or volumes.

Rather, I had to use a single, big thin volume with XFS on top.
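
For reference, my layout is roughly as follows (sizes and names are only indicative):

# one thin pool and one big, over-provisioned thin volume with XFS on top
lvcreate -L 900G --thinpool tpool vg
lvcreate -V 2T --thinpool tpool -n vmstore vg
mkfs.xfs /dev/vg/vmstore
mount /dev/vg/vmstore /var/lib/libvirt/images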

Seems to me - everyone here looks for a solution where the thin-pool is used until the very last chunk in the thin-pool is allocated - then some magical AI steps in,
smartly decides which 'other already allocated chunk' can be trashed
(possibly the one with minimal impact :)) - and the whole thing will continue
to run at full speed ;)

Sad/bad news here - it's not going to work this way....

No, I absolutely *do not want* thinp to automatically deallocate/trash some provisioned blocks. Rather, I am all for something like "if free space is lower than 30%, disable new snapshot *creation*".
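
Today I can only approximate this with a wrapper around lvcreate - a rough sketch, with made-up names and a 70% data-usage cut-off standing in for "30% free":

#!/bin/sh
# refuse to take a new thin snapshot if the pool is already too full
used=$(lvs --noheadings -o data_percent vg/tpool | tr -d ' ')
if [ "${used%%.*}" -ge 70 ]; then
    echo "vg/tpool is at ${used}%: refusing to create a new snapshot" >&2
    exit 1
fi
lvcreate -s -n vmstore_snap vg/vmstore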

lvm2 also DOES protect you from the creation of a new thin-pool when the fullness
is above the lvm.conf-defined threshold - so nothing really new here...

Maybe I am missing something: does this threshold apply to new thin pools, or to new snapshots within a single pool? I was really speaking about the latter.

[root@blackhole ~]# zfs destroy tank/vol1@snap1
[root@blackhole ~]# dd if=/dev/zero of=/dev/zvol/tank/vol1 bs=1M count=500 oflag=direct
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 12.7038 s, 41.3 MB/s
[root@blackhole ~]# zfs list -t all
NAME        USED  AVAIL  REFER  MOUNTPOINT
tank        622M   258M    96K  /tank
tank/vol1   621M   378M   501M  -

# Snapshot creation now FAILS!

ZFS is a filesystem.

So let's repeat again :) the set of problems inside a single filesystem is not comparable with the block-device layer - it's an entirely different world of problems.

You can't really expect filesystem 'smartness' at the block layer.

That's the reason why we can see all those developers boldly stepping into the 'dark waters' of mixed filesystem & block layers.

In the examples above, I did not use any ZFS filesystem layer. I used ZFS as a volume manager, with the intent of placing an XFS filesystem on top of ZVOL block volumes.

The ZFS man page clearly warns about ENOSPC with sparse volumes. My point is that, by clever use of the refreservation property, I can engineer a setup where snapshots are generally allowed unless free space is under a certain threshold. In that case, they are not allowed (but never automatically deleted!).
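
For example (sizes are made up): a zvol created with its full refreservation means a new snapshot is refused whenever the pool can no longer reserve that much space again:

zfs create -V 1G tank/vol1              # non-sparse: refreservation defaults to volsize
zfs set refreservation=1G tank/vol1     # or set/tune it explicitly
zfs snapshot tank/vol1@snap1            # fails with "out of space" if the pool
                                        # cannot back another full refreservation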

lvm2/dm trusts in a different concept - possibly less efficient,
but possibly way more secure - where you have different layers,
and each layer can be replaced and is maintained separately.

And I really trust layer separation - it is for this very reason that I am a big fan of thinp, but its failure behavior somewhat scares me.

ATM thin-pool cannot somehow auto-magically 'drop'  snapshots on its own.

Let me repeat: I do *not* want thinp to automatically drop anything. I simply want it to disallow new snapshot/volume creation when unallocated space is too low.

And that's the reason why we have those monitoring features provided with dmeventd, where you monitor the occupancy of the thin-pool, and when the
fullness goes above a defined threshold, some 'action' needs to happen.
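
(If I understand correctly, this is the kind of thing configured in lvm.conf - the values below are only an example, not a recommendation:)

# /etc/lvm/lvm.conf (excerpt)
activation {
    monitoring = 1                        # let dmeventd monitor the thin pools
    thin_pool_autoextend_threshold = 70   # act once a pool passes 70% data usage
    thin_pool_autoextend_percent = 20     # and grow it by 20% of its current size
}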

And I really thank you for that - this is a big step forward.

AFAIK a current kernel (4.13) with thinp & ext4 used with remount-ro on error, plus lvm2, is safe to use in case of emergency - so you can surely lose some uncommitted data, but after a reboot and some extra free space made available in the thin-pool, you should have a consistent filesystem without any damage after fsck.
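
(In practice, that setup looks something like this - device names are placeholders:)

# ext4 on a thin LV, switching to read-only on the first error
mount -o errors=remount-ro /dev/vg/vmstore /mnt/data
# or make it the persistent default for that filesystem:
tune2fs -e remount-ro /dev/vg/vmstore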

There are no known simple bugs in this case - like the system crashing on a dm-related OOPS (as Xen seems to suggest... - we need to see his bug report...)

However - when the thin-pool gets full - a reboot and filesystem check are basically mandatory - there is no support (and no plan to start supporting randomly dropping allocated chunks from other thin-volumes to make space for your running one).


I'd still like to see what you think is 'deadly'.

Committed (fsynced) writes are safe, and this is very good. However, *many* applications do not properly issue fsync(); this is a fact of life.

I absolutely *do not expect* thinp to automatically cope well with these applications - I fully understand & agree that applications *must* issue proper fsyncs.
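
By "proper fsyncs" I mean that data has to be explicitly flushed before it can be considered committed - with dd, for example, something like (paths are placeholders):

# without conv=fsync, data may still sit in the page cache when dd returns;
# with it, dd calls fsync() on the output file before exiting
dd if=/dev/urandom of=/mnt/data/test.img bs=1M count=100 conv=fsync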

However, recognizing that the real world is quite different from my ideals, I want to rule out as many problems as possible: for this reason, I really want to prevent full thin pools even in the face of failed monitoring (or somnolent sysadmins).

In the past, I observed that XFS takes a relatively long time to recognize that a thin volume is unavailable - and many async writes can be lost in the process. Ext4 + data=journal did a better job, but a) it is not the default filesystem in RHEL anymore and b) data=journal is not the default option and has its share of problems.

Complex systems need to be monitored - true. And I do that; in fact, I have *two* monitoring systems in place (Zabbix and a custom shell-based one). However, having been bitten by a failed Zabbix agent in the past, I learned a good lesson: design systems where some types of problems simply cannot happen.

So, if in the face of a near-full pool thinp refused to let me create a new filesystem, I would be happy :)

And I'd also like it to be explained what the thin-pool could do better
in terms of the block-device layer.

Thinp is doing a great job, and nobody wants to deny that.

Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


