reiser4: discard support

Ivan Shapovalov <intelfx100@xxxxxxxxx> · Wed, 30 Apr 2014 11:43:27 +0400

Hi Edward,

So I bit the bullet, ditched all tasks for the upcoming holidays and gave 
another try to the discard support in reiser4. Since my last status update, 
I've lost the local kernel tree due to bad scripting (decided to TRIM free 
space between partitions... and TRIM'ed everything that is NOT free space), so 
I had to start from scratch.

I decided to go with following algorithm:

During the transaction deallocated ranges are logged as-is to a list or array 
(let's call it "the delete log"). On atom commit we will generate a minimal 
superset of the delete log, comprised only of whole erase units ("the discard 
set").
So, at commit time the following actions take place:
- elements of the delete log are sorted;
- the delete log is iterated, merging any adjacent ranges;
- for each resulting range in the delete log its start is rounded down to the
  closest erase unit boundary, and, starting from this block, extents of erase
  unit length are created until whole range is covered;
- for each such new extent it is checked whether does it contain allocated
  blocks. If no (i. e. the range is completely deallocated), the range is
  added to the discard set.

Complexity of logging is thus O(n), added complexity of atom joining is O(1) 
and added complexity of atom commit is O(n*log(n) + n).

The first question is how to store the delete log.
- blocknr_set isn't ordered per se, so adjacent entries can't be easily merged
- simple linked list (extent per node) seems like too big overhead
- kmalloc()'ed arrays can't be joined in O(1) and they are static (imposes a 
hard limit on log size)
- kmem_cache... feels like using a sledge-hammer.

And oh, where in commit_current_atom() should these things be done? I guess 
right after the reiser4_write_logs(), under the tmgr mutex, but I can't really 
explain why I think so.

Thanks,
-- 
Ivan Shapovalov / intelfx /
Attachment:
signature.asc

Description: This is a digitally signed message part.