Re: Handling EOBs in CloudFS

Edward Shishkin <edward@xxxxxxxxxx> · Thu, 14 Jul 2011 23:50:01 +0200

On 07/14/2011 11:24 PM, Devon Miller wrote:
How would you handle ->lseek()? Not so much of an issue for Approach 1,
but with interleaved HMACs...

Hello.

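The offset in the interleaved file is

new_off = off + (off >> block_bits << hmac_bits);

where block_bits and hmac_bits are log2 of the data block size and
of the HMAC size, and every block is followed by its HMAC.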

So, I think this is not a big deal?
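
To make the arithmetic concrete, here is a minimal sketch of that mapping as a helper that ->lseek() could call. The function name is hypothetical, and it assumes both sizes are powers of two (e.g. 4096-byte blocks and 32-byte HMACs) and that each HMAC immediately follows its block:

```c
#include <stdint.h>

/*
 * Map a logical (plaintext) offset to the offset in the file that is
 * interleaved with HMACs. Assumed layout: every data block of
 * (1 << block_bits) bytes is followed by one HMAC slot of
 * (1 << hmac_bits) bytes. The name is illustrative only.
 */
static uint64_t eob_logical_to_stored(uint64_t off, unsigned block_bits,
                                      unsigned hmac_bits)
{
    /* number of complete blocks, and hence HMAC slots, before 'off' */
    uint64_t nblocks = off >> block_bits;

    return off + (nblocks << hmac_bits);
}
```

With block_bits = 12 and hmac_bits = 5, offsets inside the first block are unchanged, and offset 4096 (the start of the second block) maps to 4128.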

Thanks!
Edward.

On Thu, Jul 14, 2011 at 5:01 PM, Edward Shishkin <edward@xxxxxxxxxx> wrote:

    Hello everyone,
    any comments, suggestions are welcome..

           Handling EOBs (end-of-blocks) for transparent
        encryption, checking integrity and data authentication

                              DRAFT

    This was designed for CloudFS, which uses a 2-level protocol (high
    and low) supported by xlators (translators) which reside on the
    server and client sides respectively.

    Definition of EOB. Storage class

    If a file's size isn't a multiple of the cblock (cipher block) size,
    then we also need to store the special padding needed to decrypt its
    last block with some cipher modes like CBC. This padding contains a
    part of the ciphertext and must be considered a part of the file.
    We'll call this padding end-of-file (EOF). If the plain text's size
    is a multiple of the cblock size, then the encrypted file won't have
    (or will have an empty) EOF.
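
    A minimal sketch of the EOF length, under the assumption that the
    ciphertext is rounded up to a whole number of cblocks and that EOF
    is the part beyond the plain text's size. The function name is
    hypothetical:

```c
#include <stdint.h>

/*
 * Length of the EOF padding for a plain text of 'size' bytes.
 * Assumption: the stored ciphertext is rounded up to a multiple of
 * 'cblock' (the cipher block size, which must be nonzero).
 */
static uint64_t eof_len(uint64_t size, uint64_t cblock)
{
    uint64_t tail = size % cblock;   /* bytes in the last, partial cblock */

    return tail ? cblock - tail : 0; /* empty EOF when size is aligned */
}
```

    For a 100-byte file and 16-byte cblocks this gives a 12-byte EOF;
    for a 64-byte file the EOF is empty.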

    Signatures (HMACs, etc.) for checking integrity, data authentication,
    etc. have the same nature as EOF. Every such signature is created
    for some logical block in a file. Unlike EOF, a signature is not
    padding, but it is still associated with the file's data, so we'll
    consider a class of objects which includes EOFs, HMACs, etc., and
    call them EOBs (end-of-block).

    We define the storage class of EOBs as "data", i.e. an EOB can be
    considered part of the file's data: we cannot read/write a data
    block without reading/writing its EOB.

    Storing EOBs. Approaches and Issues

    Approach 1: Storing EOBs as xattr values.

    In this case we store a file in parts which are not adjacent
    from the standpoint of CloudFS. That said, we need to split
    each read, and this makes the operation non-atomic. This means
    that read(2) can return data composed of parts of different
    "versions".

    Example:

    Suppose we have a file F stored in 2 different parts F1 and F2.

    Process A writes a file F (to be of version 1);
    Process B reads a file F (part F1);
    Process C writes a file F (to be of version 2);
    Process B reads a file F (part F2);

    As a result, process B returns data composed of parts of
    two different versions, 1 and 2.
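
    The interleaving above can be replayed deterministically. This is
    only a single-threaded model, not CloudFS code; the struct and
    function names are hypothetical, and each part is reduced to the
    version number it carries:

```c
#include <stdint.h>

/* Hypothetical model of a file F stored in two non-adjacent parts. */
struct split_file {
    uint32_t f1_version;   /* version held by part F1 (e.g. a data block) */
    uint32_t f2_version;   /* version held by part F2 (e.g. its EOB) */
};

/* A writer replaces both parts with the given version. */
static void write_file(struct split_file *f, uint32_t version)
{
    f->f1_version = version;
    f->f2_version = version;
}

/*
 * Replay the A/B/C interleaving from the example.
 * Returns 1 if the two parts seen by B belong to different versions.
 */
static int replay_torn_read(void)
{
    struct split_file f = { 0, 0 };
    uint32_t seen_f1, seen_f2;

    write_file(&f, 1);          /* A writes F (version 1) */
    seen_f1 = f.f1_version;     /* B reads part F1 */
    write_file(&f, 2);          /* C writes F (version 2) */
    seen_f2 = f.f2_version;     /* B reads part F2 */

    return seen_f1 != seen_f2;  /* B holds F1 of v1 and F2 of v2 */
}
```

    If F2 is the HMAC of F1, B ends up verifying a version 1 block
    against a version 2 HMAC.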

    This non-atomicity is different from the non-atomicity that takes
    place in the kernel (local file systems): the kernel guarantees
    that all PAGE_SIZE reads with PAGE_SIZE-aligned offsets are
    atomic (this is because reads and writes in the kernel acquire
    page locks). In our case, however, F2 doesn't necessarily have
    a PAGE_SIZE-aligned offset.

    That said, we may get complaints from users who don't expect
    such non-atomicity. Moreover, when EOBs are HMACs for checking
    integrity or authentication, we'll get false positives (spurious
    verification failures), as nothing guarantees that the versions
    of an HMAC and its data block coincide.

    Solution:

    In this approach we need to serialize truncates, appending
    writes, and RbRe sequences (read block, read EOB).

    Approach 2: Storing in file's body.

    In this case EOBs are stored in the file's body (by appending to
    the file in the case of EOF, or by interleaving the file with
    HMACs, etc). So a file together with its EOBs is a single whole
    from the standpoint of CloudFS, and there are no atomicity
    problems specific to Approach 1.

    However, in this case all our files maintained by the low-level
    local fs will have increased sizes (by the total size of all their
    EOBs), so the actual file size must be stored as an additional
    attribute (e.g. as an xattr value).

    The ->open() method of the high-level translator loads the actual
    file size into the CloudFS-specific part of the inode by calling
    ->getxattr(), so that it stays resident in memory on the server.

    Any ->truncate() and appending ->write() of the high-level
    xlator updates the in-core and on-disk actual sizes together
    (the latter by calling ->setxattr()). This actual size is what
    should be returned to the user by ->fstat(), ->lookup(), etc.
    as st_size.
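
    To illustrate how far the two sizes drift apart, here is a rough
    sketch of the size the low-level fs would see for a given actual
    size. It assumes interleaved HMACs only (EOF padding is ignored),
    power-of-two block and HMAC sizes, and one HMAC after every block,
    including a partial last one. The function name is hypothetical:

```c
#include <stdint.h>

/*
 * Size of the file as stored by the low-level fs, given the actual
 * (plain text) size that is kept in the xattr and reported as st_size.
 * Assumptions: each block of (1 << block_bits) bytes, including a
 * partial last block, is followed by one HMAC of (1 << hmac_bits)
 * bytes; EOF padding is not counted; 'actual' is far from UINT64_MAX.
 */
static uint64_t cfs_stored_size(uint64_t actual, unsigned block_bits,
                                unsigned hmac_bits)
{
    uint64_t block = (uint64_t)1 << block_bits;
    /* blocks needed to hold 'actual' bytes, rounded up */
    uint64_t nblocks = (actual + block - 1) >> block_bits;

    return actual + (nblocks << hmac_bits);
}
```

    With 4096-byte blocks and 32-byte HMACs, a 4097-byte file occupies
    4161 bytes on the low-level fs, while st_size must still report
    4097.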
