Re: [PATCH 4/4] xfs: convert buffer cache hash to rbtree

Dave Chinner <david@xxxxxxxxxxxxx> · Fri, 10 Sep 2010 13:22:00 +1000

On Wed, Sep 08, 2010 at 09:51:50PM -0400, Christoph Hellwig wrote:
> On Thu, Sep 09, 2010 at 01:12:58AM +1000, Dave Chinner wrote:
> > I have selected rbtrees for indexing becuse they can have O(log n)
> > search scalability, and insert and remove cost is not excessive,
> > even on large trees. Hence we should be able to cache large numbers
> > of buffers without incurring the excessive cache miss search
> > penalties that the hash is imposing on us.
> 
> Once thing that worries me about the rbtrees is that the Linux
> implementation doesn't allow for lockless readers.  But in the end the
> buffer cache implementation is very well encapsulated, so if the need
> arises we could easily change the underlying data structure.

Agreed. I'm going for simplicity of implementation first - list to
rbtree conversion is pretty trivial and realtively easy to verify.
We can revisit the choice of rbtrees later on if/when we need to.

> > +	/*
> > +	 * The buftarg cache should never be used by external devices.
> > +	 * Ensure we catch any users with extreme prejudice.
> > +	 */
> > +	btp->bt_mp = external ? NULL : mp;
> 
> I'd much prefer to always initialize this field.  We currently have a
> b_mount field struct xfs_buf which is used only in a few places
> and initialized rather, ehmm, lazily.  If we could replace it with
> ->b_target->bt_mount we can shrink struct buf and make the information
> available much more consistently.  Just adding the mount argument
> to the buftarg and removing it from the buf would be a nice little
> preparatory patch.

Good idea. I'll run up a patch to do that - if we've got more
buffers around, giving them a diet makes sense.

> And yes, I think bt_mount would be much nicer name than bt_mp.

Agreed. call me lazy ;)

> > @@ -210,8 +210,6 @@ xfs_perag_get(struct xfs_mount *mp, xfs_agnumber_t agno)
> >  	pag = radix_tree_lookup(&mp->m_perag_tree, agno);
> >  	if (pag) {
> >  		ASSERT(atomic_read(&pag->pag_ref) >= 0);
> > -		/* catch leaks in the positive direction during testing */
> > -		ASSERT(atomic_read(&pag->pag_ref) < 1000);
> 
> Di you manage to hit this during testing?  Either way it should probably
> be a separate patch.

Not with xfstests. Takes about 0.5s for fsmark to hit it, though. ;)
I'll put it in a separate patch, too.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs