Hi All,
So, as I've mentioned here before, I run GFS2 on a two node mail
cluster, generally with good success. One issue which I am trying to
sort out is the backups. Currently we use rsync each night to create a
backup, and we're storing 60 days' worth that way, using "--link-dest"
to avoid creating 60 copies of each identical file.
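For anyone unfamiliar with the approach, the nightly job amounts to roughly the sketch below. Only "--link-dest" and the 60-day retention come from our actual setup; the date-named snapshot directories, the "-a" and "--delete" flags, the paths, and the helper names are assumptions made for illustration.

```python
import datetime
from pathlib import PurePosixPath


def rsync_command(source, backup_root, today):
    """Build the rsync argv for tonight's snapshot.

    Unchanged files are hard-linked against yesterday's snapshot via
    --link-dest, so each night only stores files that actually changed.
    Directory layout (one ISO-dated directory per night) is hypothetical.
    """
    root = PurePosixPath(backup_root)
    dest = root / today.isoformat()
    previous = root / (today - datetime.timedelta(days=1)).isoformat()
    return [
        "rsync", "-a", "--delete",
        f"--link-dest={previous}",
        f"{source}/", f"{dest}/",
    ]


def expired_snapshots(names, today, keep_days=60):
    """Return the snapshot names (ISO dates) that fall outside the retention window."""
    expired = []
    for name in names:
        age = (today - datetime.date.fromisoformat(name)).days
        if age >= keep_days:
            expired.append(name)
    return expired


if __name__ == "__main__":
    # Example paths only; substitute the real mail store and backup volume.
    print(" ".join(rsync_command("/mail", "/backup", datetime.date.today())))
```

Every file rsync examines on the GFS2 side needs a lock, whether or not it ends up being copied, which is why a few million Maildir files add up to so much locking traffic each night.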
This works well, but slowly (7 hours per night), and the backups have
quite a lot of performance impact on the production servers. Further,
it is my *suspicion* that the very large amount of locking traffic
contributes to the fairly frequent "file stuck locked" issues which come
up several times per month, requiring a reboot.
Right now I'm in the process of migrating the cluster nodes to new
hardware, which means I've got an "extra" node capable of mounting GFS2
and being experimented with. Browsing the man pages turned up the
"spectator" mount option which seemed like exactly what I wanted -- the
ability to do a read only mount that doesn't interfere with the rest of
the cluster. To my surprise, it does indeed mount read-only but it
still generates a huge amount of locking traffic on the back end
network. Although this does keep our "production" nodes from
accumulating hundreds of thousands of locks, and thus perhaps improves
their reliability, I was hoping for more. Btw, "spectator" does not
work in conjunction with "lockproto=lock_nolock".
So next, I tried mounting with "ro,lockproto=lock_nolock" thinking that
it would give me a purely non-interfering mount. This failed for two
reasons. One, these startup messages scared me into thinking that the
"recovery" process might corrupt the filesystem. Apparently "ro"
doesn't quite mean "ro":
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=: Trying to join cluster
"lock_nolock", "mail_cluster:mail_fac"
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
Joined cluster. Now mounting FS...
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=0, already locked for use
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=0: Looking at journal...
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=0: Acquiring the transaction lock...
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
recovery required on read-only filesystem.
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
write access will be enabled during recovery.
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=0: Replaying journal...
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=0: Replayed 26 of 27 blocks
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=0: Found 1 revoke tags
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=0: Journal replayed in 1s
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=0: Done
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=1: Trying to acquire journal lock...
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=1: Looking at journal...
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=1: Acquiring the transaction lock...
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
recovery required on read-only filesystem.
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
write access will be enabled during recovery.
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=1: Replaying journal...
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=1: Replayed 28 of 34 blocks
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=1: Found 6 revoke tags
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=1: Journal replayed in 1s
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=1: Done
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=2: Trying to acquire journal lock...
Oct 1 18:34:12 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=2: Looking at journal...
Oct 1 18:34:13 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=2: Done
Oct 1 18:34:13 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=3: Trying to acquire journal lock...
Oct 1 18:34:13 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=3: Looking at journal...
Oct 1 18:34:13 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
jid=3: Done
The second reason it failed is that after a couple of hours, the mount
failed as follows:
Oct 1 19:19:24 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
fatal: invalid metadata block
Oct 1 19:19:24 post2-new kernel: GFS2:
fsid=mail_cluster:mail_fac.0: bh = 54432241 (magic number)
Oct 1 19:19:24 post2-new kernel: GFS2:
fsid=mail_cluster:mail_fac.0: function = gfs2_meta_indirect_buffer,
file = /builddir/build/BUILD/gfs2-kmod-1.92/_kmod_build_/meta_io.c, line = 334
Oct 1 19:19:24 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
about to withdraw this file system
Oct 1 19:19:24 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
telling LM to withdraw
Oct 1 19:19:24 post2-new kernel: GFS2: fsid=mail_cluster:mail_fac.0:
withdrawn
(I have the call trace as well, if anybody's interested.) Thinking
about this, it seems clear that the failure occurred because some other
node changed things while my poor, confused read-only & no locking node
was reading them. This makes sense.
So I'm wondering two things:
1. What does spectator mode do exactly? Is it just the same as
specifying "ro" or are there other optimizations?
2. Would it be possible to have a mount mode that's strictly read-only,
no locking, and incorporates tolerance for errors? After all, I'm
backing up Maildirs (a few million individual files) every night. If I
miss a few messages one night, it's unlikely to matter. So if we could
return an i/o error for a particular file without withdrawing from the
cluster, that would be wonderful. Better yet, why not purge the cached
data relating to this particular file and read it from disk again? Most
likely, that'll fetch valid data and the file will be accessible again.
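To make the second question concrete, here is a minimal sketch of the user-space half of what I have in mind: a copy loop that records a per-file error and moves on instead of aborting. It is not a GFS2 feature and it would not prevent a withdraw; it only helps if the kernel returns an error for the one bad file. The function name and paths are hypothetical.

```python
import os
import shutil


def copy_tolerant(src_root, dst_root):
    """Copy a directory tree, skipping files that cannot be read.

    Returns a list of (path, error message) pairs for everything skipped,
    so a later run can retry just those files.
    """
    skipped = []

    def note_walk_error(err):
        # os.walk calls this when a directory itself cannot be listed.
        skipped.append((err.filename, err.strerror))

    for dirpath, _dirnames, filenames in os.walk(src_root, onerror=note_walk_error):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            try:
                shutil.copy2(src, os.path.join(target_dir, name))
            except OSError as err:
                # An I/O error on one message should not stop the backup.
                skipped.append((src, err.strerror))
    return skipped
```

Calling something like `copy_tolerant("/mnt/mail_ro", "/backup/tonight")` (example paths) would back up everything readable and return the handful of files to retry the following night.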
Thanks in advance for any thoughts that you might have!
Allen
--
Allen Belletti
allen@xxxxxxxxxxxxxxx 404-894-6221 Phone
Industrial and Systems Engineering 404-385-2988 Fax
Georgia Institute of Technology