On Thu, 14 September 2006 15:55:45 -0600, Andreas Dilger wrote: > On Sep 14, 2006 23:02 +0200, J???rn Engel wrote: > > > > Using your scheme (slightly reduced) we now have: > > size32 size64 funky? > > d_count 4 4 > > d_flags 4 4 > > d_lock 4 4_ y > > d_inode 4_ 8 > > d_hash 8 16-- > > d_parent 4 8_ > > d_name 12-- 16___ > > d_lru 8_ 16_ > > d_rcu/d_child 8 16__ > > d_subdirs 8___ 16_ > > d_alias 8 16____ > > d_time 4 8 > > d_op 4_ 8_ > > d_sb 4 8 > > d_fsdata 4 8__ > > d_cookie 0 0 y > > d_mounted 4 4 > > d_iname 36____ 36 > > > > With the two funky fields possibly growing, depending on kernel > > config. [_-] mark 16-, 32- 64- and 128-byte boundaries, depending on > > len. What really frightens me is that a 32-byte boundary goes right > > through d_name on 32bit machines. > > Actually, splitting d_name like this is not so bad (as long as the > compiler doesn't add padding) because the important fields (hash > and len) are first and are compared for all non-matching dentries > in __d_lookup(). Except that d_name is split between hash and len, not between len and name. You said d_lock was added later? Looks like it broke careful tuning for 32-byte cacheline machines. And it could also have caused the misalignment on 64bit. > > Now d_lookup() should use a single cacheline, even on my aged > > notebook, and the other hot fields remain at the top. d_mounted is > > also moved up to remove the misalignment on 64bit. Might be worth > > a benchmark or two to see whether it makes a difference... > > Might not be too hard (even if it temporarily kills performance) > to add atomic counters for each of these fields where they are > referenced in dcache.c, namei.c and e.g. fs/ext3/*.c (which is > only using d_inode, d_name, and d_parent). Run a find and a > kernel compile and dump the counters at shutdown. Might be even simpler by running gcov/lcov. That would not show which fields are used at similar times, but it is a start. Btw, I already mentioned how reducing the d_iname fields by a few bytes could save a cacheline. And whenever I have a good idea, Arnd usually comes up with a better one. How about the patch below to fix dentries to 128/192 bytes on 32/64 bit machines, independently of config options? It would shrink them from 132 bytes to 128 bytes in my current config (although I don't quite remember why I turned CONFIG_PROFILING on). Jörn -- Joern's library part 14: http://www.sandpile.org/ --- slab/include/linux/dcache.h~dentry_size 2006-09-14 22:19:51.000000000 +0200 +++ slab/include/linux/dcache.h 2006-09-15 12:25:27.000000000 +0200 @@ -77,7 +77,7 @@ full_name_hash(const unsigned char *name struct dcookie_struct; -#define DNAME_INLINE_LEN_MIN 36 +#define DNAME_INLINE_LEN_MIN 16 struct dentry { atomic_t d_count; @@ -112,7 +112,10 @@ struct dentry { #endif int d_mounted; unsigned char d_iname[DNAME_INLINE_LEN_MIN]; /* small names */ -}; +}__attribute__((aligned(64))); /* make sure the dentry is 128/192 bytes + on 32/64 bit independently of config + options. d_iname will vary in length + a bit. */ /* * dentry->d_lock spinlock nesting subclasses: - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html