Re: lvm2 deadlock

On 05. 06. 24 at 10:59, Jaco Kroon wrote:
Hi,

On 2024/06/04 18:07, Zdenek Kabelac wrote:
On 04. 06. 24 at 13:52, Jaco Kroon wrote:
Last but not least, disk scheduling policies also have an impact, e.g. they can ensure better fairness at the price of lower throughput...
We normally use mq-deadline; in this setup I notice it has been changed to "none". The plan was to revert. This was done as part of a discussion with Bart van Assche. Happy to revert this, to be honest. https://lore.kernel.org/all/07d8b189-9379-560b-3291-3feb66d98e5c@xxxxxxx/ is related.
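
For reference, the scheduler currently active on a device can be read from sysfs, where the kernel marks the active one with square brackets. The following is only a minimal sketch: the helper names are made up for illustration, and the device names are whatever you pass on the command line.

```python
from pathlib import Path


def parse_active_scheduler(text: str) -> str:
    """Return the scheduler name marked with [brackets] in sysfs output."""
    tokens = text.split()
    for token in tokens:
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Assumption: some devices list a single unbracketed entry such as "none".
    if len(tokens) == 1:
        return tokens[0]
    raise ValueError(f"no active scheduler marked in: {text!r}")


def active_scheduler(device: str) -> str:
    """Read /sys/block/<device>/queue/scheduler (device is e.g. "sda")."""
    path = Path(f"/sys/block/{device}/queue/scheduler")
    return parse_active_scheduler(path.read_text())


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python3 sched.py sda nvme0n1
    for dev in sys.argv[1:]:
        print(dev, active_scheduler(dev))
```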

Hi

So I guess we can tell the story like this:

When you create a 'snapshot' of a thin volume, this enforces a full flush (& fsfreeze) of that thin volume, so any dirty pages need to be written to the thin pool before the snapshot can be taken (and the thin pool should not run out of space). This CAN potentially hold your system for a long time (depending on the performance of your storage) and may cause various lock-up states if you are using the volume being snapshotted for anything else: the volume is suspended, so it blocks further operations on this device, eventually causing a full-system circular deadlock (catch-22). This is hard to analyze without the whole picture of the system.

We may eventually think about whether we can somehow minimize the time spent holding the VG lock and suspending with flush & fsfreeze, for example by flushing the disk upfront to minimize the amount of dirty data, but that is a possible future enhancement.
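
Until something like that exists in lvm2 itself, the same idea can be approximated from user space by flushing before asking for the snapshot. This is only a sketch of that workaround, not anything lvm2 does on its own: the function names and the VG/LV/snapshot names are hypothetical, and a system-wide sync only shrinks the dirty set at that moment; new writes can still arrive before the suspend.

```python
import os
import subprocess


def snapshot_command(vg: str, lv: str, snap_name: str) -> list[str]:
    """Build the lvcreate call for a snapshot of vg/lv (names are placeholders)."""
    return ["lvcreate", "--snapshot", "--name", snap_name, f"{vg}/{lv}"]


def flush_then_snapshot(vg: str, lv: str, snap_name: str) -> None:
    """Flush dirty pages first, then take the snapshot (needs root)."""
    # System-wide flush, so the flush done during the suspend has less to write.
    os.sync()
    subprocess.run(snapshot_command(vg, lv, snap_name), check=True)


if __name__ == "__main__":
    # Hypothetical names; adjust to your volume group and thin volume.
    print(" ".join(snapshot_command("vg0", "thinvol", "thinvol_snap")))
```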

For now, reducing the dirty page queue to minimize the blocking time associated with snapshotting is the right choice (although 500M is probably unnecessarily low...).
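
To judge how low the limit really needs to be, it may help to watch how much dirty data is actually queued. A minimal sketch follows, assuming the limit in question is the vm.dirty_bytes sysctl (the thread does not say which knob was used); the helper names are made up for illustration.

```python
from pathlib import Path


def parse_meminfo_kb(text: str, fields=("Dirty", "Writeback")) -> dict:
    """Extract the selected /proc/meminfo fields, in kB."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            values[name] = int(rest.split()[0])
    return values


def read_dirty_limit_bytes() -> int:
    """Read vm.dirty_bytes; 0 means the ratio-based limit is in effect."""
    return int(Path("/proc/sys/vm/dirty_bytes").read_text())


if __name__ == "__main__":
    stats = parse_meminfo_kb(Path("/proc/meminfo").read_text())
    print("dirty kB:", stats.get("Dirty"))
    print("writeback kB:", stats.get("Writeback"))
    print("vm.dirty_bytes:", read_dirty_limit_bytes())
```

Sampling these values shortly before a snapshot gives a rough idea of how much data the flush will have to write out.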


Regards

Zdenek




