Re: long object names

On Thu, Apr 21, 2011 at 3:00 PM, Tommi Virtanen
<tommi.virtanen@xxxxxxxxxxxxx> wrote:
> On Thu, Apr 21, 2011 at 01:03:57PM -0700, Gregory Farnum wrote:
>> I like what Yehuda has here for its relative simplicity
>
> It's far from simple.
>
> Let's look at the unlink path:
>
>
> static int lfn_unlink(const char *pathname)
> {
>  const char *filename;
>  char short_fn[PATH_MAX];
>  char short_fn2[PATH_MAX];
>  int r, i, exist, err;
>  int path_len;
>  int is_lfn;
>
> ** helper function that splits the path into dir and file, figures out
> ** a short name for this long name, counts the length of the directory
> ** part of the path, and other things; loops through the candidates,
> ** comparing against the xattr
>  r = lfn_get(pathname, short_fn, sizeof(short_fn), &filename, &exist, &is_lfn);
>  if (r < 0)
>    return r;
> ** if the filename wasn't actually too long, take the easy way out
>  if (!is_lfn)
>    return unlink(pathname);
>  if (!exist) {
>    errno = ENOENT;
>    return -1;
>  }
>
> ** actual file unlink here
>  err = unlink(short_fn);
>  if (err < 0)
>    return err;
>
> ** and then, rename all the collisions, one by one, because they have
> ** a sequential number in them!
>  path_len = filename - pathname;
>  memcpy(short_fn2, pathname, path_len);
>
> ** this loop finds the highest sequential number in this hash
> ** collision bucket; on exit, i is one past it
>  for (i = r + 1; ; i++) {
>    struct stat buf;
>    int ret;
>
>    build_filename(&short_fn2[path_len], sizeof(short_fn2) - path_len, filename, i);
>    ret = stat(short_fn2, &buf);
>    if (ret < 0) {
>      if (i == r + 1)
>        return 0;
>
>      break;
>    }
>  }
>
> ** and then the highest seq number munged filename gets renamed to
> ** fill the gap we left behind
>  build_filename(&short_fn2[path_len], sizeof(short_fn2) - path_len, filename, i - 1);
>  generic_dout(0) << "renaming " << short_fn2 << " -> " << short_fn << dendl;
>
>  if (rename(short_fn2, short_fn) < 0) {
>    generic_derr << "ERROR: could not rename " << short_fn2 << " -> " << short_fn << dendl;
>    assert(0);
>  }
>
>  return 0;
> }

This is a work in progress; proper locking is required and will be added.
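
For illustration only, one way the unlink, stat loop, and rename could be
serialized is an exclusive advisory lock on the containing directory. This
is a minimal sketch, not the locking that will go into the patch: dir_lock()
and dir_unlock() are hypothetical names, and it assumes flock() is available
and that every path touching the collision bucket (create included) takes
the same lock.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helpers, not part of the patch under discussion.
 * dir_lock() takes an exclusive advisory lock on a directory and
 * returns the fd that holds it, or -1 with errno set.
 */
static int dir_lock(const char *dir_path)
{
  int fd, saved;

  assert(dir_path != NULL);
  fd = open(dir_path, O_RDONLY);
  if (fd < 0)
    return -1;

  /* blocks until no other open file description holds the lock */
  if (flock(fd, LOCK_EX) < 0) {
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
  }
  return fd;
}

/* Release the lock taken by dir_lock() and close its fd. */
static void dir_unlock(int fd)
{
  flock(fd, LOCK_UN);
  close(fd);
}
```

In lfn_unlink(), everything from unlink(short_fn) through the final rename()
would sit between dir_lock() and dir_unlock(). Within a single process, a
mutex keyed on the directory would likely do the same job more cheaply.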

>
>
> Now, imagine a colliding file create between the stat and the rename
> -> boom. This is not the only race in there.
>
Yeah, we're well aware of those races. Note that splitting into
subdirectories is racy too. Imagine one thread/process creating an
object while another is removing a similar object with the same
prefix. The first one tries to create a subtree while the other is
trying to remove the same subtree. I've seen these issues before;
they're real.
The chances of hitting these issues with a non-hashed structure are much
greater than the chances of hitting those races when an appropriate
hash algorithm is used (the 'zzz' hash is just a filler).
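
To make the subdirectory race concrete, here is a minimal sketch (not code
from any proposed patch; all names are hypothetical, and a one-level
directory stands in for the subtree). The creator must cope with the
directory vanishing between mkdir() and open(), and the remover must
tolerate the directory still being in use by another object.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

enum { SKETCH_PATH_MAX = 4096, SKETCH_MAX_TRIES = 16 };

/* Hypothetical helper: join dir and name into buf; -1 if it does not fit. */
static int join_path(char *buf, size_t len, const char *dir, const char *name)
{
  int n = snprintf(buf, len, "%s/%s", dir, name);

  if (n < 0 || (size_t)n >= len) {
    errno = ENAMETOOLONG;
    return -1;
  }
  return 0;
}

/* Create dir/name, retrying if a concurrent remover deletes dir. */
static int create_in_subdir(const char *dir, const char *name)
{
  char path[SKETCH_PATH_MAX];
  int fd, tries;

  assert(dir != NULL && name != NULL);
  if (join_path(path, sizeof(path), dir, name) < 0)
    return -1;

  for (tries = 0; tries < SKETCH_MAX_TRIES; tries++) {
    if (mkdir(dir, 0755) < 0 && errno != EEXIST)
      return -1;
    fd = open(path, O_CREAT | O_WRONLY, 0644);
    if (fd >= 0 || errno != ENOENT)
      return fd;
    /* ENOENT: dir was removed between mkdir() and open(); try again */
  }
  errno = EAGAIN;
  return -1;
}

/* Remove dir/name, then try to prune dir if it became empty. */
static int remove_from_subdir(const char *dir, const char *name)
{
  char path[SKETCH_PATH_MAX];

  assert(dir != NULL && name != NULL);
  if (join_path(path, sizeof(path), dir, name) < 0)
    return -1;
  if (unlink(path) < 0)
    return -1;

  /* another object (or a racing creator) may still need the directory */
  if (rmdir(dir) < 0 &&
      errno != ENOTEMPTY && errno != EEXIST && errno != ENOENT)
    return -1;
  return 0;
}
```

The retry loop only papers over one interleaving; it does not replace
proper locking, and a deeper subtree would need the same care at each
level.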

Yehuda