Re: Possible bug in expanding thinpool: lvextend doesn't expand the top-level dm-linear device

On 4.1.2016 at 06:08, M.H. Tsai wrote:
2016-01-03 7:05 GMT+08:00 Zdenek Kabelac <zkabelac@redhat.com>:
On 1.1.2016 at 19:10, M.H. Tsai wrote:
2016-01-01 5:25 GMT+08:00 Zdenek Kabelac <zkabelac@redhat.com>:
There is even a sequencing problem with creating snapshots in the kernel
target, which probably needs to be fixed first.
(The rule here should be - never create/allocate anything while
there is a suspended device

Excuse me, does the statement
'to never create/allocate something when there is suspended device'
describe the case where the thin-pool is full and the volume is
suspended 'with no flush'? Because there are no free blocks for
allocation.

The reason for this is - you could suspend a device backing e.g. swap/root,
so now - if during any allocation the kernel needed a memory
chunk and required some 'swap/root' space on the suspended disk, the kernel
would block endlessly.

So the table reload (with the updated dm table line) should always happen
before the suspend (aka the PRELOAD phase in lvm2 code).

The following device resume should then just switch tables without any
memory allocations - those should all have been resolved in the load phase,
where you always have 2 slots - active & inactive.

(And yes - there are some (known) problems with this rule in current lvm2 and some dm targets...)
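
To illustrate the ordering with plain dmsetup (a minimal sketch - the device
name and thin table line here are made up for the example):

# load the new table into the inactive slot first (PRELOAD)
dmsetup reload thin-volume --table "0 4194304 thin 253:2 1"
# only now suspend - the following resume merely swaps in the
# preloaded table, with no further allocations needed
dmsetup suspend thin-volume
dmsetup resume thin-volume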

Otherwise, it would be strange if we could not do these operations when
the pool is not full.

Extension of a device is 'special' - in fact we could enable 'suspend WITHOUT flush' for any 'lvextend' operation - but that needs a full re-validation of all targets - so for now it's only enabled for thin-pool lvextend.

'Suspend with flush' is typically needed when you change the device type in some way - however, in the pure lvextend case (only new space is added, no existing device space changes) there cannot be any BIO in-flight routed into the 'newly extended' space - thus a flush is not needed. (unsure if this explanation makes sense)
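
To make this concrete - a minimal example, with a hypothetical VG 'vg'
holding a thin-pool 'pool':

# grows only the pool's data device; no existing mappings change,
# so the flush-less suspend path described above is usable
lvextend -L+10G vg/pool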


and this rule is broken by the current thin
snapshot creation, so the thin snap create message should go first,
to ensure there is space in the thin-pool ahead of the origin suspend - will
be addressed in some future version....)

However, when taking a snapshot - only the origin thin LV is suspended now,
and this should not influence the rest of the thin volumes (except for
thin-pool commit points).

Does that mean that in a future version of dm-thin, the command sequence
for snapshot creation will be:

dmsetup message /dev/mapper/pool 0 "create_snap 1 0"
dmsetup suspend /dev/mapper/thin
dmsetup resume /dev/mapper/thin

Possibly a different message - since everything must remain
fully backward compatible (i.e. create_snap_on_suspend,
or maybe some other mechanism will be there).
But yes, something in this direction...
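
For contrast, the sequence documented today in the kernel's
thin-provisioning.txt sends the message while the origin is already
suspended - which is exactly the allocation-under-suspend problem
discussed above:

dmsetup suspend /dev/mapper/thin
dmsetup message /dev/mapper/pool 0 "create_snap 1 0"
dmsetup resume /dev/mapper/thin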

I don't quite understand. Is the new message designed for the case where
the thin-pool is nearly full?
Because the pool's free data blocks might not be sufficient for 'suspend
with flush' (i.e., 'suspend with flush' might fail if the pool is
nearly full), we should move the create_snap message before the
suspend. However, the created snapshots would then be inconsistent.
If the pool is full, then there's no difference between taking
snapshots before or after 'suspend without flush'.
Is that right?

As said - the solution is nontrivial and needs enhancements
to the suspend API - when you suspend a 'thinLV origin' you need
to use suspend with flush - however ATM such a suspend may 'block'
the whole of lvm2 while lvm2 holds the VG lock.

As a prevention - the lvm2 user can configure a threshold for autoresize
(e.g. 70%), and when the pool is above that threshold the user is not allowed
to create any new thinLV. This normally works quite ok - but it's obviously
not a 'bullet-proof' solution (as you could construct a case where the gap
between time-of-check and time-of-use causes an out-of-space pool).
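
For reference, the knobs involved live in the activation section of lvm.conf
(70 is just the example value from above; autoextension also requires the
pool to be monitored by dmeventd):

# /etc/lvm/lvm.conf
activation {
    # start autoextending once the pool is 70% full, by 20% each time
    thin_pool_autoextend_threshold = 70
    thin_pool_autoextend_percent = 20
}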

So far the rule is simple - at all costs - do not run a thin-pool when it's
full; an overfilled pool is NOT comparable to a 'single' write error.
When an admin is solving an overfilled pool - something went wrong earlier
(the admin failed to extend his VG)....

A thin-pool is about 'promising' space the user can deliver 'later' - not
about hitting the overfull corner case as a 'regular' use-case where the user
can expect some well-handled error behavior (but yes, we try to make the user
experience better here).

Regards

Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


