Re: [PATCH 00/14] GFS

Mark Fasheh <mark.fasheh@xxxxxxxxxx> · Wed, 3 Aug 2005 11:54:01 -0700



On Wed, Aug 03, 2005 at 12:37:44PM +0200, Lars Marowsky-Bree wrote:
> On 2005-08-03T11:56:18, David Teigland <teigland@xxxxxxxxxx> wrote:
> 
> > > * Why use your own journalling layer and not say ... jbd ?
> > Here's an analysis of three approaches to cluster-fs journaling and their
> > pros/cons (including using jbd):  http://tinyurl.com/7sbqq
> 
> Very instructive read, thanks for the link.
While it may be true that for a full log, flushing for a *single* lock may
be more expensive in OCFS2, Ken ignores the fact that in our one big flush
we've made all locks on journalled resources immediately releasable.
According to that description, GFS2 would have to do a seperate transaction
flush (including the extra step of writing revoke records) for each lock
protecting a journalled resource. Assuming the same number of locks are
required to be dropped under both systems then for a number of locks > 1
OCFS2 will actually do less work - the actual metadata blocks would be the
same on either end, but JBD only has to write that the journal is now clean
to the journal superblock whereas GFS2 has to revoke the blocks for each
dropped lock.

Of course all of this talk completely avoids the fact that in any case these
things are expensive so a cluster file system has to take care to ping locks
as little as possible. OCFS2 takes great pains to make as many operations
node local (requiring no cluster locks) as possible - data allocation is
usually done from a node local pool which is refreshed from the main bitmap.
Deallocation happens similarly - we have a truncate log in which we record
deleted clusters. Each node has their own inode and metadata chain
allocators which another node will only lock for delete (a truncate log
style local metadata delete log could easily be added if that ever became a
problem).
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@xxxxxxxxxx

--

Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster