On Wed, Apr 20, 2011 at 09:42:48PM -0700, Sage Weil wrote: > I think the lfn_open/_get type interface below all of the operation > methods will allow all of those things. I think it'll be simpler to push > as much of the filename rendering into that layer as possible, though > (possibly including the sobject_t mangling). Having all the > mangling/rendering done in one place will also make it easy to extend > without making multiple passes in different layers... So apparently, this email confused many. Let's try to clarify, to the best of my understanding: - Yehuda's code would currently write this: ./rados mkpool kitties ./rados create --pool=foo "longcatisl$(python -c 'print 300*"o"')ng" >>> FILENAME_SHORT_LEN=16 >>> FILENAME_HASH_LEN=3 >>> FILENAME_COOKIE="cLFN" >>> FILENAME_PREFIX_LEN = (FILENAME_SHORT_LEN - FILENAME_HASH_LEN - 1 - (len(FILENAME_COOKIE) - 1) - 1) >>> orig = "longcatisl"+300*"o"+"ng" >>> storable = orig + "_head" # plus backslash escaping, but let's ignore that now >>> munged = orig[:FILENAME_PREFIX_LEN] + "_" + FILENAME_COOKIE + "_%s_%d" % ("zzz", 42) >>> munged 'longcati_cLFN_zzz_42' So it loses the _head suffix, and all such things. - Sage wanted (as far as I understand) to have the munging be where the backslash escaping is, so the end result would look like 'longcati_cLFN_zzz_42_head'. Note the suffix. I hope I got that right. As for Yehuda's approach, I'm not very happy to see layers upon layers of rewriting the filenames.. It just seems more brittle. So Sage's version looks nicer to me, can do all the work it needs to do in a single pass, and lets us see the _head etc suffixes without reading xattrs, which might be useful for fsck-style things. As for both, I'm especially not fond of the very limited hashing, and the loops that keep calling build_filename. I fear collisions and races. Bumping up the hash size significantly will help the common case, but I still fear the races. I also think that Ceph, and especially the RGW bits, needs to be written to be fairly robust against DoS attacks. Nasty things happen out there, and having somebody able to trigger a "slow mode" on your server with fairly cheap operations is bad. Here's a concrete proposal: split the filename into subdirs if needed, and map the names 1:1, just to avoid the unpredictability of the above approach. And to get significantly less code and branching in the fast path. That is, I think I'd go for something like (Python written in C style to make it more direct to translate): # how much overhead to reserve in filenames to always have # prefix/suffix not split by slashes LONGEST_PREFIX_SUFFIX_LEN = len("_head") SAFE_FILENAME_LEN = 255 - LONGEST_PREFIX_SUFFIX_LEN def munge(path): dirprefix = None while len(path) > SAFE_FILENAME_LEN: head = path[:SAFE_FILENAME_LEN] if dirprefix is None: dirprefix = head else: dirprefix = dirprefix + '/' + head path = path[SAFE_FILENAME_LEN:] return dirprefix + '/' + path and now >>> munged = munge(orig) >>> munged 'longcatisloooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo/oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong' >>> munged + '_head' 'longcatisloooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo/oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong_head' And the caller would just mkdir all the leading dirs (ignore EEXIST errors). They can either be left around, or with a small loop handling a race between mkdir & rmdir, can be cleaned up either on unlink or in fsck/scrub/etc. Also, I'm not thrilled to have something this core, *and* being string manipulation in C, go without unit tests that exercise the corner cases. Finally, here's some misc notes on the existing code that are probably obvious, I just wanted to make sure: - hash is always "zzz" - xattr user.ceph._lfn conflicts with actual end-user xattrs "_lfn"? - escaping can lengthen the filename, does it handle that (I guess yes because this is a layer after that, but I can't tell without reading a lot of code) -- :(){ :|:&};: -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html