On Fri, Nov 12, 2010 at 05:02:02PM +1100, Nick Piggin wrote:
> On Thu, Nov 11, 2010 at 08:48:38PM -0800, Linus Torvalds wrote:
> > On Thu, Nov 11, 2010 at 5:24 PM, Nick Piggin <npiggin@xxxxxxxxx> wrote:
> > >
> > > So this is really not a "oh, maybe someone will see 10-20% slowdown", or even
> > > 1-2% slowdown.
> >
> > You ignored my bigger issue: the _normal_ way - and the better way -
> > to handle these things is with SLAB_DESTROY_BY_RCU.
>
> Well I tried to answer that in the other threads.
>
> SLAB_DESTROY_BY_RCU is indeed quite natural for a lot of RCU usages,
> because even with standard RCU you almost always have the pattern like
>
>   rcu_read_lock();
>   obj = lookup_data_structure(key);
>   if (obj) {
>     lock(obj);
>     verify_obj_in_structure(obj, key);
>     /* blah... (eg. take refcount) */
>   }
>
> And in this pattern, SLAB_DESTROY_BY_RCU takes almost zero work.
>
> OK, but rcu-walk doesn't have that. In rcu-walk, we can't take a lock
> or a reference on either the dentry _or_ the inode, because the whole
> point is to reduce atomics (for single threaded performance), and
> stores into shared cachelines along the path (for scalability).
>
> It gets really interesting when you have crazy stuff going on like
> inode->i_ops changing from underneath you while you're trying to do
> permission lookups, or inode type changing from link to dir to reg
> in the middle of the traversal.
>
>
> > So what are the advantages of using the inferior approach? I really
> > don't see why you push the whole "free the damn things individually"
> > approach.
>
> I'm not pushing _that_ aspect of it. I'm pushing the "don't go away and
> come back as something else" aspect.
>
> Yes it may be _possible_ to do store-free walking SLAB_DESTROY_BY_RCU,
> and I have some ideas. But it is hairy. More hairy than rcu-walk, by
> quite a long shot.

So in short, that is my justification.

12% is a worst-case regression, but the demonstration is obviously an
absurdly worst case, and is merely there as an "ok, the sky won't fall on
anybody's head" upper bound. In reality, it's likely to be well under
0.1% in any real workload, even an inode-intensive one.

So I much prefer to err on the side of less complexity, to start with.
There just isn't much risk of regression AFAIKS, and much more risk of
becoming unmaintainably complex.
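
For reference, the lookup, lock, then verify pattern discussed in this thread can be modelled in userspace roughly as follows. This is only a minimal single-threaded sketch, not kernel code: struct obj, pool, lookup_unlocked(), lock_and_verify() and recycle() are all hypothetical names, the "lock" is a plain flag standing in for a real per-object spinlock, and rcu_read_lock() is left out entirely. It just illustrates why re-checking the key after taking the lock is enough when freed memory may come back as another object of the same type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object in a type-stable pool: a slot may be reused for
 * another object of the same type, but never for a different type. */
struct obj {
    int key;      /* identity; changes if the slot is recycled */
    bool in_use;  /* false once the object has been freed */
    bool locked;  /* stand-in for a real per-object lock */
};

#define POOL_SIZE 4
static struct obj pool[POOL_SIZE];

static void obj_lock(struct obj *o)   { o->locked = true; }
static void obj_unlock(struct obj *o) { o->locked = false; }

/* Unlocked lookup: the result may be stale by the time it is used. */
static struct obj *lookup_unlocked(int key)
{
    for (size_t i = 0; i < POOL_SIZE; i++)
        if (pool[i].in_use && pool[i].key == key)
            return &pool[i];
    return NULL;
}

/* Take the lock, then re-check that the object is still the one we
 * looked up. Returns true with the lock held, or false with the lock
 * released if the slot was freed or recycled in the meantime. */
static bool lock_and_verify(struct obj *o, int key)
{
    obj_lock(o);
    if (o->in_use && o->key == key)
        return true;
    obj_unlock(o);
    return false;
}

/* Reuse a slot for a new key, as a type-stable allocator is allowed to. */
static void recycle(struct obj *o, int new_key)
{
    assert(o != NULL);
    o->key = new_key;
    o->in_use = true;
}
```

If a slot is recycled between lookup_unlocked() and lock_and_verify(), the verify step fails and the caller simply retries or gives up. The rcu-walk case has no such lock step to hang the verification on, which is where the difficulty described above comes from.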