Re: optimising DLM speed?

Hi,

On Wed, 2011-02-16 at 20:38 +0000, Alan Brown wrote:
> > For the GFS2 glocks, that doesn't matter - all of the glocks are held
> > in a single hash table no matter how many filesystems there are.
> 
> Given nearly 4 million glocks currently on one of the boxes in a quiet
> state (and nearly 6 million if everything was on one node), is the
> existing hash table large enough?
> 
> 
It is a concern. The table cannot realistically be expanded forever, and
expanding it "on the fly" would be very tricky. There are, however, other
factors which determine the scalability of the hash table, not just the
number of hash heads. By using RCU in the upstream code, we've been able
to reduce locking and improve speed by a significant factor without
needing to increase the number of list heads in the hash table. We did
increase that number anyway, though, since the new scheme we are using
packs both the hash chain lock and the hash table head into a single
pointer. The locks therefore take up no extra space, which is why we
increased the number of hash table heads at the same time.
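
To give a rough idea of what packing the lock and the head into a single
pointer can look like, here is a minimal userspace sketch (this is not
the actual kernel or GFS2 code, and all of the names in it are made up
for illustration): nodes are pointer-aligned, so bit 0 of a valid node
address is always zero and can double as a per-bucket spin lock, keeping
each bucket down to exactly one word.

#include <stdatomic.h>
#include <stdint.h>
#include <stddef.h>

/* Illustrative only: one word per bucket, low bit doubles as a lock.
 * Nodes are pointer-aligned, so bit 0 of a valid address is free. */
struct bl_node {
        struct bl_node *next;
        uint64_t key;
};

struct bl_head {
        _Atomic(uintptr_t) first;       /* node pointer | lock bit */
};

#define BL_LOCK_BIT ((uintptr_t)1)

static void bl_lock(struct bl_head *h)
{
        uintptr_t old;
        for (;;) {
                old = atomic_load_explicit(&h->first, memory_order_relaxed);
                if (old & BL_LOCK_BIT)
                        continue;       /* spin while someone holds it */
                if (atomic_compare_exchange_weak_explicit(&h->first, &old,
                                old | BL_LOCK_BIT,
                                memory_order_acquire, memory_order_relaxed))
                        return;
        }
}

static void bl_unlock(struct bl_head *h)
{
        atomic_fetch_and_explicit(&h->first, ~BL_LOCK_BIT,
                                  memory_order_release);
}

/* Insert at head; the caller must hold the bucket lock. */
static void bl_add(struct bl_head *h, struct bl_node *n)
{
        uintptr_t cur = atomic_load_explicit(&h->first,
                                             memory_order_relaxed);
        n->next = (struct bl_node *)(cur & ~BL_LOCK_BIT);
        atomic_store_explicit(&h->first,
                              (uintptr_t)n | BL_LOCK_BIT, /* stay locked */
                              memory_order_relaxed);
}

The point is simply that growing the number of buckets no longer
multiplies the space devoted to locks.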

However large we grow the table, though, it will never really be
"enough", so the next development will probably be to keep trees rather
than chains of glocks under each hash table head; chain lengths will
then scale with log(N) rather than N. The difficulty with doing that is
making such a structure work with RCU.
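
As a sketch of the trees-per-bucket idea (again illustrative only, and
simplified to a plain binary search tree keyed by a made-up glock
number; a real implementation would want a balanced tree such as an
rbtree, and would have to be made safe for RCU readers), a lookup within
a bucket then walks a log-depth path instead of a whole chain:

#include <stdint.h>
#include <stddef.h>

/* One tree of glocks per hash bucket: average lookup cost within the
 * bucket becomes O(log N) rather than O(N) for a chain. */
struct tnode {
        struct tnode *left, *right;
        uint64_t glock_no;              /* illustrative key */
};

static struct tnode *bucket_tree_find(struct tnode *root, uint64_t no)
{
        while (root) {
                if (no < root->glock_no)
                        root = root->left;
                else if (no > root->glock_no)
                        root = root->right;
                else
                        return root;
        }
        return NULL;
}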

We do go to some lengths to avoid doing hash lookups at all. Once a
glock has been attached to an inode, we don't look it up in the hash
table again until the inode has been pushed out of the cache, so the
lookup cost only shows up on a workload which is constantly scanning new
inodes that are not already in cache. At least until now, the time taken
for the I/O associated with such operations has been much larger, so the
lookups haven't really registered as an important performance item.
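
Put another way, the hash lookup cost is paid once per inode brought
into cache; after that the inode carries its glock pointer directly.
Schematically (the structure and field names here are invented, not the
real GFS2 ones):

/* Sketch of the "look up once, then cache the pointer" pattern.
 * Structure and field names are invented for illustration. */
struct my_glock;                        /* lives in the hashed table */

struct my_inode {
        struct my_glock *i_gl;          /* set when the inode is read in */
        /* ... */
};

/* Hot path: no hash lookup, just follow the cached pointer.  The
 * pointer is only dropped when the inode is evicted from the cache,
 * at which point a fresh access pays for one hash lookup again. */
static struct my_glock *glock_for_inode(struct my_inode *ip)
{
        return ip->i_gl;
}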

Obviously if it causes problems, then we'll look into addressing them.
Hopefully that explains a bit more of our reasoning behind the decisions
that have been made. Please let us know if we can be of further help,

Steve.

