Hi,

On Thu, 2007-10-04 at 08:53 +0200, Arthur MEßNER wrote:
> First i build a xen disk image on the local storage fs,
> then want "cp" this disk image to the gfs2 filesystem.
>
> > cp disk.img /xenfs/storage1/xenmachine/
>
> this filesystem ( /xenfs/storage1 )is gfs2 with option -j 10, lock_dlm
>
> mount option is : noatime,quota=off
>
> I have done this sevreal times on one node
> of my two node cluster, no problem.
>
> Yesterday i added another node ( the third ),
> an on this node the same procedure ended up with this.
>
> Oct 4 07:44:15 xen03 kernel: GFS2: fsid=xen:storage1.1: fatal:
> assertion "x <= length" failed
> Oct 4 07:44:15 xen03 kernel: GFS2: fsid=xen:storage1.1: function =
> rgblk_search, file = fs/gfs2/rgrp.c, line = 1116
> Oct 4 07:44:15 xen03 kernel: GFS2: fsid=xen:storage1.1: about to
> withdraw this file system
> Oct 4 07:44:15 xen03 kernel: GFS2: fsid=xen:storage1.1: telling LM to
> withdraw
> Oct 4 07:44:45 xen03 kernel: GFS2: fsid=xen:storage1.1: withdrawn
>
> On this node and this gfs2 filesystem the crash is reproduceable.
> i never tried it on another node, because they are productive.
>
> After reboot, the gfs2 can normally be accessed.
> The same if i try with "dd if=xxx.img of=xxx.img"
>
> Any suggestion, where the problem is ?
> locking, gfs2 options ....
>

This message means that when GFS2 tried to allocate some blocks, it
couldn't find any in the resource group it had previously selected and
in which it had previously reserved some blocks.

The reason this appears to affect only a single node is that GFS2
tries, where it can, to keep each resource group local to a single
node. That avoids having to pass the lock (and hence also the cache)
of the resource group around the cluster, which is inefficient. So this
may show up on the other nodes as the filesystem gets closer to being
full, which increases the chance that the other nodes will search this
resource group.

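To make the failure mode a bit more concrete, here is a toy model in
Python. It is only a sketch and not the kernel code: the class, the
function names and the single free-block counter are all made up for
illustration. It assumes the allocator trusts a per-resource-group
free count and then scans the bitmaps for a free block, so a count
that overstates what the bitmaps hold makes the scan come up empty.

```python
FREE, USED = 0, 1


class ResourceGroup:
    """Hypothetical stand-in for a resource group (not a GFS2 structure)."""

    def __init__(self, bitmaps, free_count):
        self.bitmaps = bitmaps        # list of bitmaps, one state per block
        self.free_count = free_count  # summary count, possibly stale


def search_free_block(rgd):
    """Scan every bitmap in turn; return (bitmap index, offset) of a free block."""
    for index, bitmap in enumerate(rgd.bitmaps):
        for offset, state in enumerate(bitmap):
            if state == FREE:
                return index, offset
    # Every bitmap was searched and nothing was free, although the
    # summary count promised otherwise. In this toy model that is the
    # point where the assertion fires.
    raise AssertionError(
        "summary reports %d free blocks, bitmaps have none" % rgd.free_count
    )


def allocate(rgd):
    """Allocate one block, trusting the summary count before scanning."""
    if rgd.free_count == 0:
        return None  # a caller would move on to another resource group
    index, offset = search_free_block(rgd)
    rgd.bitmaps[index][offset] = USED
    rgd.free_count -= 1
    return index, offset
```

With a consistent group such as ResourceGroup([[USED, FREE]], 1),
allocate() returns a block and the count drops to zero. With
ResourceGroup([[USED, USED]], 2) the count and the bitmaps disagree
and the search raises.
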
I'd suggest, in the first instance, running GFS2's fsck to be certain
that it's a problem on the disk, but that's what it looks like to me.
It is probably just the summary information that is out of line with
the actual bitmaps on that resource group, so I wouldn't expect to see
any data loss. Do you know what kind of fs activity caused that in the
first place?

I can't see anything else that you are doing wrong, but I wonder which
kernel version you are using?

Steve.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster