Jan,

2015-12-14 22:14 GMT+01:00 Jan Kara <jack@xxxxxxx>:
>> (1) Many files with the same xattrs: Right now, an xattr block can be
>> shared among at most EXT[24]_XATTR_REFCOUNT_MAX = 2^10 inodes. If 2^20
>
> Do you know why there's this limit BTW? The on-disk format can support upto
> 2^32 references...

the idea behind that is to limit the damage that a single bad block can cause.

>> inodes are cached, they will have at least 2^10 xattr blocks, all of
>> which will end up in the same hash chain. An xattr block should be
>> removed from the mbcache once it has reached its maximum refcount, but
>> if I haven't overlooked something, this doesn't happen right now.
>> Fixing that should be relatively easy.
>
> Yeah, that sounds like a good optimization. I'll try that.
>
>> (2) Very many files with unique xattrs. We might be able to come up
>> with a reasonable heuristic or tweaking knob for detecting this case;
>> if not, we could at least use a resizable hash table to keep the hash
>> chains reasonably short.
>
> So far we limit number of entries in the cache which keeps hash chains
> short as well. Using resizable hash table and letting the system balance
> number of cached entries just by shrinker is certainly possible however I'm
> not sure whether the complexity is really worth it.
>
> Regarding detection of unique xattrs: We could certainly detect trashing
> of mbcache relatively easily. The difficult part if how to detect when to
> enable it again because the workload can change. I'm thinking about some
> backoff mechanism like caching only each k-th entry asked to be inserted
> (starting with k = 1) and doubling k if we don't reach some low-watermark
> cache hit ratio in some number of cache lookups, reducing k to half if
> we reach high-watermark cache hit ratio.

Such a heuristic would probably start in the same state after each reboot,
so frequent reboots would lead to bad performance.
Something as dumb as a configurable list of unsharable xattr names would
allow tuning things without such problems and without adding much
complexity.

No matter what we end up doing here, mostly-unique xattrs on separate
blocks will always lead to bad performance compared to in-inode xattrs.
Some wasted memory for the mbcache is not the main problem here.

Thanks,
Andreas
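
For reference, here is a rough userspace sketch of the k-th-entry backoff
heuristic quoted above, only to make the discussion concrete. Nothing in it
exists in mbcache: the struct, the function names, the window size and both
watermark percentages are made-up assumptions. Note that backoff_init()
always starts at k = 1, which is the "same state after each reboot" problem
mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names and thresholds below are hypothetical, chosen for illustration. */
#define BACKOFF_WINDOW   1024u       /* lookups per evaluation window */
#define BACKOFF_LOW_PCT  5u          /* hit ratio below this: double k */
#define BACKOFF_HIGH_PCT 20u         /* hit ratio at or above this: halve k */
#define BACKOFF_MAX_K    (1u << 16)  /* upper bound so k cannot grow forever */

struct backoff_state {
	uint32_t k;        /* cache only every k-th insert request */
	uint32_t pending;  /* insert requests seen since the last cached one */
	uint32_t lookups;  /* lookups in the current window */
	uint32_t hits;     /* cache hits in the current window */
};

static void backoff_init(struct backoff_state *s)
{
	s->k = 1;          /* always starts here, e.g. after each reboot */
	s->pending = 0;
	s->lookups = 0;
	s->hits = 0;
}

/* Record one cache lookup; re-evaluate k once a full window has been seen. */
static void backoff_record_lookup(struct backoff_state *s, bool hit)
{
	uint32_t pct;

	s->lookups++;
	if (hit)
		s->hits++;
	if (s->lookups < BACKOFF_WINDOW)
		return;

	pct = s->hits * 100u / s->lookups;
	if (pct < BACKOFF_LOW_PCT && s->k < BACKOFF_MAX_K)
		s->k *= 2;     /* cache is not paying off: insert less often */
	else if (pct >= BACKOFF_HIGH_PCT && s->k > 1)
		s->k /= 2;     /* cache is paying off: insert more often */

	s->lookups = 0;
	s->hits = 0;
}

/* Return true if this insert request should actually go into the cache. */
static bool backoff_should_insert(struct backoff_state *s)
{
	if (++s->pending < s->k)
		return false;
	s->pending = 0;
	return true;
}
```

A caller would invoke backoff_record_lookup() on every cache lookup and
gate each insert on backoff_should_insert(). A static list of unsharable
xattr names needs none of this state.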