Dear All,

There are a lot of errors of the following type in my client and NFS logs following a recent volume expansion:

[2012-02-16 22:59:42.504907] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-atmos-dht: subvol: atmos-replicate-0; inode layout - 0 - 0; disk layout - 920350134 - 1227133511
[2012-02-16 22:59:42.534399] I [dht-common.c:524:dht_revalidate_cbk] 0-atmos-dht: mismatching layouts for /users/rle/TRACKTEMP/TRACKS
[2012-02-16 22:59:42.534521] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-atmos-dht: subvol: atmos-replicate-1; inode layout - 0 - 0; disk layout - 1227133512 - 1533916889

I have expanded the volume successfully many times in the past. I can think of several possible reasons why this expansion might have gone wrong, but without expert advice I am just guessing:

1) I ran precautionary ext4 filesystem checks on all the bricks and found errors on some of them, mostly things like this:

   Pass 1: Checking inodes, blocks, and sizes
   Inode 104386076, i_blocks is 3317792, should be 3317800.  Fix? yes

2) I always use hostname.domain for new GlusterFS servers when doing "gluster peer probe HOSTNAME" (e.g. gluster peer probe bdan14.nerc-essc.ac.uk). I normally use hostname.domain (e.g. bdan14.nerc-essc.ac.uk) when creating volumes or adding bricks as well, but for the last brick I added I used just the hostname (bdan14). I can "ping bdan14" from all the servers and clients, and the only access to the volume from outside my subnetwork is via NFS.

3) I found some old GlusterFS client processes still running, probably left over from previous occasions when the volume was auto-mounted. I have seen this before and I don't know why it happens, but normally I just kill the unwanted glusterfs processes without affecting the mount.

4) I recently started using more than one server to export the volume via NFS in order to spread the load; in other words, two NFS clients may mount the same volume exported from two different servers. I don't remember reading anywhere that this is not allowed, but as it is a recent change I thought it worth checking.

5) I normally let people carry on using a volume while a fix-layout process runs in the background. I don't remember reading that this is not allowed either, but again I thought it worth checking. I don't run migrate-data after fix-layout because it doesn't work on my cluster. Normally fix-layout completes without error and no "mismatching layouts" errors are observed. However, the volume is now so large that fix-layout usually takes several days to complete, which means far more files are created and modified during the run than before. Could the continued use of the volume during such a lengthy fix-layout be causing the layout errors?

I have run fix-layout three times now, and the second attempt crashed. All I can think of doing is to try again now that several of the back-end filesystems have been repaired.

Could any of the above factors have caused the layout errors, and can anyone suggest a better way to remove them? All comments and suggestions would be much appreciated.

Regards,
Dan
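
P.S. For reference, the operations I mention above are along these lines (a rough sketch; "atmos" is the volume name taken from the log prefix, and brick paths are omitted):

   # probe the new server by its fully qualified name
   gluster peer probe bdan14.nerc-essc.ac.uk
   gluster peer status

   # check how the bricks are recorded (hostname vs hostname.domain)
   gluster volume info atmos

   # redistribute the directory layouts after adding bricks, then watch progress
   gluster volume rebalance atmos fix-layout start
   gluster volume rebalance atmos status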