On 02/22/2012 07:22 AM, Dan Bretherton wrote:
> I would really appreciate a quick Yes/No answer to the most important
> question - is it safe to create, modify and delete files in a volume
> during a fix-layout operation after an expansion?

My first reaction is to say yes, that should generally be safe. Fix-layout
changes some xattrs on directories that control where new files are placed,
but even if a file ends up on the "wrong" brick, the algorithm to find it
anyway is pretty robust. It is possible to see layout anomalies if a client
looks up a directory at the exact moment that its layout xattrs are being
updated, but that should be a very rare and transient case.

OTOH, the log entries below do seem to indicate that there's something going
on that I don't understand. I'll dig a bit, and let you know if I find
anything to change my mind wrt the safety of restoring write access.

>
> The users are champing at the bit waiting for me to let them have write
> access, but fix-layout is likely to take several days based on previous
> experience.
>
> -Dan
>
> On 02/22/2012 02:52 AM, Dan Bretherton wrote:
>> Dear All-
>> There are a lot of the following type of errors in my client and NFS
>> logs following a recent volume expansion.
>>
>> [2012-02-16 22:59:42.504907] I [dht-layout.c:682:dht_layout_dir_mismatch]
>> 0-atmos-dht: subvol: atmos-replicate-0; inode layout - 0 - 0; disk
>> layout - 920350134 - 1227133511
>> [2012-02-16 22:59:42.534399] I [dht-common.c:524:dht_revalidate_cbk]
>> 0-atmos-dht: mismatching layouts for /users/rle/TRACKTEMP/TRACKS
>> [2012-02-16 22:59:42.534521] I [dht-layout.c:682:dht_layout_dir_mismatch]
>> 0-atmos-dht: subvol: atmos-replicate-1; inode layout - 0 - 0; disk
>> layout - 1227133512 - 1533916889
>>
>> I have expanded the volume successfully many times in the past. I can
>> think of several possible reasons why this one might have gone wrong,
>> but without expert advice I am just guessing.
>>
>> 1) I did precautionary ext4 filesystem checks on all the bricks and
>> found errors on some of them, mostly things like this:
>>
>> Pass 1: Checking inodes, blocks, and sizes
>> Inode 104386076, i_blocks is 3317792, should be 3317800. Fix? yes
>>
>> 2) I always use hostname.domain for new GlusterFS servers when doing
>> "gluster peer probe HOSTNAME" (e.g. gluster peer probe
>> bdan14.nerc-essc.ac.uk). I normally use hostname.domain (e.g.
>> bdan14.nerc-essc.ac.uk) when creating volumes or adding bricks as well,
>> but for the last brick I added I just used the hostname (bdan14). I can
>> do "ping bdan14" from all the servers and clients, and the only access
>> to the volume from outside my subnetwork is via NFS.
>>
>> 3) I found some old GlusterFS client processes still running, probably
>> left over from previous occasions when the volume was auto-mounted. I
>> have seen this before and I don't know why it happens, but normally I
>> just kill unwanted glusterfs processes without affecting the mount.
>>
>> 4) I recently started using more than one server to export the volume
>> via NFS in order to spread the load. In other words, two NFS clients
>> may mount the same volume exported from two different servers. I don't
>> remember reading anywhere that this is not allowed, but as this is a
>> recent change I thought it would be worth checking.
>>
>> 5) I normally let people carry on using a volume while a fix-layout
>> process is going on in the background. I don't remember reading that
>> this is not allowed but I thought it worth checking. I don't do
>> migrate-data after fix-layout because it doesn't work on my cluster.
>> Normally the fix-layout completes without error and no "mismatching
>> layout" errors are observed. However the volume is now so large that
>> fix-layout usually takes several days to complete, and that means that
>> a lot more files are created and modified during fix-layout than
>> before. Could the continued use of the volume during the lengthy
>> fix-layout be causing the layout errors?
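On that last point: as I said above, concurrent use during fix-layout should
generally be safe in itself. One thing you could do in the meantime is check
what fix-layout has actually written for one of the directories in those
messages. As root on each server, something like the following (the brick
path here is just a placeholder for wherever that brick's data actually
lives) dumps the layout xattr that fix-layout rewrites:

  getfattr -n trusted.glusterfs.dht -e hex \
      /path/to/brick/users/rle/TRACKTEMP/TRACKS

Each brick's copy of the directory stores its own hash range in that xattr.
If the ranges from all the bricks for that directory together cover the whole
32-bit hash space with no gaps or overlaps, the on-disk layout is sane and
the messages are most likely just clients revalidating stale cached layouts;
gaps or all-zero ranges would be more worrying.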
>>
>> I have run fix-layout 3 times now and the second attempt crashed. All I
>> can think of doing is to try again now that several back-end filesystems
>> have been repaired. Could any of the above factors have caused the
>> layout errors, and can anyone suggest a better way to remove them? All
>> comments and suggestions would be much appreciated.
>>
>> Regards
>> Dan.
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
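PS: if you do kick off another fix-layout run once the back-end filesystems
are repaired, you should be able to watch its progress from the command line
with something like

  gluster volume rebalance atmos status

(assuming the volume really is called "atmos", as the "0-atmos-dht" log
prefixes suggest). That at least makes it obvious whether a multi-day run is
still making progress or has quietly died like the second one did.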