Hello Andreas,

On Thu, 14 September 2006 12:33:25 -0600, Andreas Dilger wrote:
>
> I think it makes sense to keep d_inode in the first part of the dentry
> always, because it is by far the most referenced field in the dentry,
> along with the critical fields from prune_dcache(), shrink_dcache_anon(),
> dget(), dput(), d_lookup().

d_inode is definitely one of the hotter fields in there. It just
happens to cause the misalignment. Bah, I don't see a good solution.

> While not totally accurate in terms of runtime frequency of use, the counts
> in the code:
>
> fs/*.[ch] fs/*/*.[ch] size32 size64 prune_dc shrk_dc_anon d_lookup
> d_inode 384 2131 4 8
> d_lock 104 529 4 4 1 2
> d_count 18 66 4 4 1 2
> d_lru 18 18 4_ 8 1 1
> d_hash 37 154 4 8_ 2 1
> d_name 73 908 12_ 16 1
> d_flags 26 104 4 4 2
> d_mounted 7 7 4 4
> d_parent 40 231 4 8_ 2
> d_op 37 269 4 8
> d_rcu/d_child 3+22 3+45 8 16

[ d_hash is 8/16, actually ]

d_hash, d_name and d_parent belong way up to the top of the list, imo.
d_lookup() should be the hottest function of all, as the comment in
the structure definition already indicates.

Maybe the solution is to rearrange the fields with those going to the
top? Using your scheme (slightly reduced) we now have:

                size32   size64   funky?
d_count         4        4
d_flags         4        4
d_lock          4        4_       y
d_inode         4_       8
d_hash          8        16--
d_parent        4        8_
d_name          12--     16___
d_lru           8_       16_
d_rcu/d_child   8        16__
d_subdirs       8___     16_
d_alias         8        16____
d_time          4        8
d_op            4_       8_
d_sb            4        8
d_fsdata        4        8__
d_cookie        0        0        y
d_mounted       4        4
d_iname         36____   36

With the two funky fields possibly growing, depending on kernel
config. [_-] mark 16-, 32- 64- and 128-byte boundaries, depending on
len.

What really frightens me is that a 32-byte boundary goes right through
d_name on 32bit machines. Iirc, my PIII has 32-byte cachelines. Not
good.

How about moving [d_hash,d_parent,d_name] to the front? Something
like:

                size32   size64   funky?
d_hash          8        16_
d_parent        4        8
d_name          12-      16--
d_inode         4        8_
d_count         4__      4
d_flags         4        4
d_lock          4        4        y
d_mounted       4        4
d_lru           8        16
d_rcu/d_child   8        16
d_subdirs       8        16
d_alias         8        16
d_time          4        8
d_op            4        8
d_sb            4        8
d_fsdata        4        8
d_cookie        0        0        y
d_iname         36       36

Now d_lookup() should use a single cacheline, even on my aged
notebook, and the other hot fields remain at the top. d_mounted is
also moved up to remove the misalignment on 64bit.

Might be worth a benchmark or two to see whether it makes a
difference...

Jörn

-- 
Joern's library part 1:
http://lwn.net/Articles/2.6-kernel-api/