Re: [PATCH] reiser4: precise discard - general case

Ivan Shapovalov <intelfx100@xxxxxxxxx> · Tue, 10 Feb 2015 23:42:24 +0300

On 2014-12-20 at 21:24 +0100, Edward Shishkin wrote:
> This is the promised generalization, which is supposed to work for all
> discard offsets and all discard unit sizes without any restrictions.
> 
> Complications in comparison with the previous implementation:
> 
> In this general case we need "precise" coordinates, where every
> individual byte can be addressed. All local variables, which represent
> precise coordinates are denoted with "prefixes" (a_len, d_off, p_tailp,
> etc). Local variables, which represent "non-precise" coordinates (they
> are usually of type reiser4_block_nr) are denoted without prefixes
> (start, len, end, tailp, etc).
> 
> Blocks, which contain head and tail paddings are now calculated using
> the function size_in_blocks(), which actually is an expression for the
> minimal number of blocks containing the precise extent.
> 
> The next trouble is "peculiarity in 0", encountered when calculating
> the blocks of head padding. If the discard offset is different from 0,
> then the first discard unit of the partition is partial (its other part
> doesn't belong to our partition, so we cannot discard it). We handle
> this peculiarity by an additional check.
> 
> In other bits everything is the same.
> 
> Possible optimization: If discard unit sizes are always powers of 2,
> then it makes sense to replace "do_div(offset, unit_size)" with
> "offset & (unit_size - 1)".
> 
> Mount options discard.offset=xxx,discard.unit=yyy are to emulate
> various discard unit sizes and offsets on devices _without_ trim
> support (e.g. HDDs). This is only for debugging purposes, don't use it
> for real SSD devices: the kernel retrieves the discard parameters on
> its own.
> 
> This patch is against the patch series of Ivan Shapovalov:
> http://marc.info/?l=reiserfs-devel&m=141841865432082&w=2
> 
> Current status: not well-tested.
> 
> Edward.
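
For reference, the block arithmetic described above can be modelled in
user space roughly as follows. This is only a sketch, not the code from
the patch: the signature of size_in_blocks() is assumed, and rem_div()
and rem_mask() are hypothetical helpers that compare the remainder
do_div() returns with the proposed mask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space model, not the reiser4 code.
 * Minimal number of blocks that contain the byte extent [off, off + len).
 */
static uint64_t size_in_blocks(uint64_t off, uint64_t len, uint32_t blksize)
{
	assert(blksize > 0);
	if (len == 0)
		return 0;
	/* index of the last block touched, minus index of the first, plus one */
	return (off + len - 1) / blksize - off / blksize + 1;
}

/* Remainder by plain division, standing in for the value do_div() returns. */
static uint32_t rem_div(uint64_t offset, uint32_t unit_size)
{
	assert(unit_size > 0);
	return (uint32_t)(offset % unit_size);
}

/* Same remainder by masking; valid only when unit_size is a power of two. */
static uint32_t rem_mask(uint64_t offset, uint32_t unit_size)
{
	assert(unit_size > 0 && (unit_size & (unit_size - 1)) == 0);
	return (uint32_t)(offset & (unit_size - 1));
}
```

Note that a 2-byte extent starting at byte 4095 with 4096-byte blocks
straddles a block boundary and therefore needs two blocks, which is why
the count cannot be derived from the length alone.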

Hi,

I've found a bug in our implementation (I don't know when it appeared;
it may have been there for quite some time). I intended to fix it and
send a patch along with a description, but I still can't think of a
viable fix.

So: the problem is that check_free_blocks() isn't idempotent, because it
allocates the blocks if the whole extent is clean. Therefore, it must not
be called for overlapping ranges. However, under some conditions the tail
padding of one extent and the head padding of the next extent may overlap
in terms of disk blocks (the gluing code only catches overlapping erase
units).

This will yield a false negative when checking the head padding (the
shared blocks have already been allocated by the tail padding check), so
it does not lead to any data loss (just to inefficiency).
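
To illustrate, here is a toy user-space model of the effect. It is a
sketch under assumptions, not the reiser4 code: check_and_grab() is a
hypothetical stand-in for check_free_blocks(), and the bitmap and block
numbers are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 16

/* Toy allocation bitmap: true means the block is allocated (busy). */
static bool busy[NBLOCKS];

/*
 * Hypothetical stand-in for check_free_blocks(): succeeds only if every
 * block in [start, start + len) is free, and in that case allocates the
 * whole range. On failure nothing is changed.
 */
static bool check_and_grab(size_t start, size_t len)
{
	assert(start <= NBLOCKS && len <= NBLOCKS - start);
	for (size_t i = start; i < start + len; i++)
		if (busy[i])
			return false;
	for (size_t i = start; i < start + len; i++)
		busy[i] = true;
	return true;
}
```

Suppose the tail padding of one extent covers blocks 4-5 and the head
padding of the next covers blocks 5-6. check_and_grab(4, 2) succeeds and
marks block 5 busy, so check_and_grab(5, 2) then fails although block 5
was free before the first call.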

Comments appreciated...

Thanks,
-- 
Ivan Shapovalov / intelfx /