On Thu, Apr 21, 2011 at 2:54 PM, Yehuda Sadeh Weinraub <yehudasa@xxxxxxxxx> wrote:
> On Thu, Apr 21, 2011 at 2:44 PM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>> On Thu, Apr 21, 2011 at 2:23 PM, Yehuda Sadeh Weinraub
>> <yehudasa@xxxxxxxxx> wrote:
>>> On Thu, Apr 21, 2011 at 2:09 PM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>>>> On Thu, Apr 21, 2011 at 1:03 PM, Gregory Farnum
>>>> <gregory.farnum@xxxxxxxxxxxxx> wrote:
>>>>> I really don't see how pushing the naming complexity into the local filesystem,
>>>>> where it adds lots of otherwise-useless inodes and dentries, is going to help us.
>>>>
>>>> Here is a quick summary of how the TV's proposal would help us.
>>>> 1. it avoids collisions entirely
>>>> 2. You don't ever have do an extra xattr lookup, no matter how short
>>>> or long the object name is.
>>>
>>> Yeah, but you read more directories. Note that btrfs stores the xattrs
>>> on the directories, so reading those xattrs will have a lower IO
>>> impact than traversing directories recursively.
>>
>> It does seem like btrfs' extended attribute implementation is fairly
>> efficient. But Linux's dentry cache (dcache) is also pretty efficient.
>>
> (resending to list)
>
> It needs to be populated first before being efficient. And it'll be
> less efficient now that you populate it with extra entries.

That is a good point. However, xattrs also have a cost. It seems like
btrfs sometimes creates an inode for xattrs, and sometimes just stashes
them in the dentry (presumably if there aren't many and they're small?)

The xattr-scheme always creates an extra xattr per entry. The
directory-based scheme creates extra directories, but not that many,
assuming a lot of objects have names with similar prefixes-- an
assumption that is likely to be true nearly all the time.

I think both schemes are doable, but I still lean towards the
directory-based one, just because I like fast prefix search.
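To make the prefix-sharing argument concrete, here is a rough sketch of how a directory-based layout might behave. It is not the actual proposal: it simply assumes that an object name is cut into fixed-size path components, with every component except the last becoming a directory. The helper names (`split_name`, `name_to_path`, `count_extra_dirs`) and the `chunk` parameter are hypothetical, and the sketch ignores escaping of `/`, `.` and `..`, as well as the case where one object's full name equals another object's directory prefix.

```python
NAME_MAX = 255  # common per-component limit on Linux filesystems (assumed default)


def split_name(name, chunk=NAME_MAX):
    """Cut an object name into components of at most `chunk` characters."""
    if not name:
        raise ValueError("empty object name")
    return [name[i:i + chunk] for i in range(0, len(name), chunk)]


def name_to_path(name, chunk=NAME_MAX):
    """Map an object name to a relative path; all but the last part are directories."""
    return "/".join(split_name(name, chunk))


def count_extra_dirs(names, chunk=NAME_MAX):
    """Count the distinct intermediate directories a set of object names needs."""
    dirs = set()
    for name in names:
        parts = split_name(name, chunk)
        # Every proper prefix of the component list is one directory.
        for depth in range(1, len(parts)):
            dirs.add(tuple(parts[:depth]))
    return len(dirs)


if __name__ == "__main__":
    # A tiny chunk size keeps the demonstration readable.
    names = ["abcdefgh", "abcdijkl", "abcdmnop"]
    for n in names:
        print(n, "->", name_to_path(n, chunk=4))
    print("extra directories:", count_extra_dirs(names, chunk=4))
```

With names that share a prefix, the three objects above need only one extra directory between them, whereas an xattr-based layout would store one extra xattr per long-named object. Listing everything under a given prefix then reduces to reading a single directory.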
Colin