[DOC] Reiser4 Hybrid Transaction Model

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

After numerous requests I have documented the transaction model which
works in Reiser4 by default.


               Reiser4 Hybrid Transaction Model


Reiser4 transaction model is a high-level block allocator. User can
specify (by mount option) any of 3 currently existing transaction
models. First and second ones implement classic journaling and
write-anywhere (Copy-on-Write) transaction models. They are
implemented as a hard-coded models in various file systems. Here
we describe a unique Hybrid Transaction Model suggested by
Joshua Macdonald and Hans Reiser in 2001.


         I. Relocation Decision (relocate or overwrite)


In this transaction model a part of atom's blocks is scheduled to be
relocated (RELOCATE_SET). Another part - to be overwritten
(OVERWRITE_SET). All system blocks (super-block, bitmap blocks, etc)
for obvious reasons always fall to OVERWRITE set. All new nodes (which
haven't had block numbers yet) always fall to RELOCATE_SET. For other
blocks of the atom the relocation decision is based on the statistics
accumulated by a special scanning procedure. A positive decision
(relocate) is made in the case if enough (FLUSH_RELOCATE_THRESHOLD)
nodes were found for flushing (see comments and source code in flush.c
for details).


           II. Block allocation policy for RELOCATE_SET


For the RELOCATE_SET the high level block allocator first allocates
extents of free blocks. Moreover, the block allocator follows the
parent-first order on the RELOCATE_SET considered as tree.

Definition. Parent-first order on a tree is a linear order on tree
nodes determined by the following recursive procedure:

void parent_first(node)
{
    print_node (node);
    if (node->level > leaf) {
        for (i = 0; i < num_children; i += 1)
            parent_first (node->child[i]);
    }
}

Specifically, (node1 < node2) means that the procedure above prints
node1 before node2.

Comment. The definition above assumes that all tree nodes on the same
level were ordered by some linear manner (by the array child[i]).
In our case the mentioned order is determined by their natural
enumeration "from left to right" in the tree.

Reiser4 hybrid transaction model keeps parent-first order on the
RELOCATE_SET in terms of block numbers (disk addresses). It means
that the following implication is true:

    (node1 < node2) => (block_nr(node1) < block_nr(node2))

"<" in the second part of implication means usual linear order on the
set of disk addresses (block numbers).

It is achieved by finding and setting a "preceder" for the
RELOCATE_SET at the very beginning of its relocation. Preceder is the
maximal block number (disk address) with the following properties:

1) it is marked "busy" in the space map;
2) it is smaller than all old block numbers of the given RELOCATE_SET.

Further, during allocation of new block numbers for RELOCATE_SET the
variable containing the preceder gets updated and is passed as a
"hint" to the low-level space allocator.
--
To unsubscribe from this list: send the line "unsubscribe reiserfs-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux File System Development]     [Linux BTRFS]     [Linux NFS]     [Linux Filesystems]     [Ext4 Filesystem]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Resources]

  Powered by Linux