Hi all,

I have a test cluster with 3 nodes: nd09, nd10 and nd12. The cluster software is the latest STABLE branch, and the kernel is 2.6.15.

On nd12:
  11 processes write sequentially to the GFS with no speed limit; each process removes the oldest file after it finishes writing the newest one.
  1 process runs 'ls' on the whole GFS.
  200 threads concurrently read 200 files written by the processes above.
  5 processes run 'df' on the GFS at 0.5-second intervals.

On nd10:
  1 process writes.
  200 threads read the same files as on nd12.
  1 process runs 'ls'.
  5 processes run 'df'.

On nd09:
  200 threads read the same files as on nd12.
  1 process runs 'ls'.
  5 processes run 'df'.

After about 10 hours of testing, GFS withdrew on nodes nd10 and nd12. The messages were:

<--
May 13 07:30:47 nd12 kernel: GFS: fsid=test:gfs-dm1.2: fatal: assertion "x <= length" failed
May 13 07:30:47 nd12 kernel: GFS: fsid=test:gfs-dm1.2: function = blkalloc_internal
May 13 07:30:47 nd12 kernel: GFS: fsid=test:gfs-dm1.2: file = /home/sunjw/projects/cluster.STABLE/gfs-kernel/src/gfs/rgrp.c, line = 1458
May 13 07:30:47 nd12 kernel: GFS: fsid=test:gfs-dm1.2: time = 1147476646
May 13 07:30:47 nd12 kernel: GFS: fsid=test:gfs-dm1.2: about to withdraw from the cluster
May 13 07:30:47 nd12 kernel: GFS: fsid=test:gfs-dm1.2: waiting for outstanding I/O
May 13 07:30:47 nd12 kernel: GFS: fsid=test:gfs-dm1.2: telling LM to withdraw
May 13 07:30:49 nd12 kernel: lock_dlm: withdraw abandoned memory
May 13 07:30:49 nd12 kernel: GFS: fsid=test:gfs-dm1.2: withdrawn
May 13 07:30:54 nd10 kernel: GFS: fsid=test:gfs-dm1.1: jid=2: Trying to acquire journal lock...
May 13 07:30:54 nd10 kernel: GFS: fsid=test:gfs-dm1.1: jid=2: Busy
May 13 07:36:51 nd10 kernel: GFS: fsid=test:gfs-dm1.1: fatal: assertion "x <= length" failed
May 13 07:36:51 nd10 kernel: GFS: fsid=test:gfs-dm1.1: function = blkalloc_internal
May 13 07:36:51 nd10 kernel: GFS: fsid=test:gfs-dm1.1: file = /home/sunjw/projects/cluster.STABLE/gfs-kernel/src/gfs/rgrp.c, line = 1458
May 13 07:36:51 nd10 kernel: GFS: fsid=test:gfs-dm1.1: time = 1147477010
May 13 07:36:51 nd10 kernel: GFS: fsid=test:gfs-dm1.1: about to withdraw from the cluster
May 13 07:36:51 nd10 kernel: GFS: fsid=test:gfs-dm1.1: waiting for outstanding I/O
May 13 07:36:51 nd10 kernel: GFS: fsid=test:gfs-dm1.1: telling LM to withdraw
May 13 07:36:54 nd10 kernel: lock_dlm: withdraw abandoned memory
May 13 07:36:54 nd10 kernel: GFS: fsid=test:gfs-dm1.1: withdrawn
May 13 01:20:05 nd09 kernel: dlm: gfs-dm1: process_lockqueue_reply id 62f203f3 state 0
May 13 01:41:09 nd09 kernel: dlm: gfs-dm1: process_lockqueue_reply id 6fa600de state 0
May 13 07:28:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=2: Trying to acquire journal lock...
May 13 07:28:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=2: Looking at journal...
May 13 07:28:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=2: Acquiring the transaction lock...
May 13 07:28:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=2: Replaying journal...
May 13 07:28:48 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=2: Replayed 160 of 532 blocks
May 13 07:28:48 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=2: replays = 160, skips = 99, sames = 273
May 13 07:28:48 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=2: Journal replayed in 1s
May 13 07:28:48 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=2: Done
May 13 07:34:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=1: Trying to acquire journal lock...
May 13 07:34:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=1: Looking at journal...
May 13 07:34:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=1: Acquiring the transaction lock...
May 13 07:34:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=1: Replaying journal...
May 13 07:34:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=1: Replayed 6 of 71 blocks
May 13 07:34:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=1: replays = 6, skips = 4, sames = 61
May 13 07:34:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=1: Journal replayed in 1s
May 13 07:34:47 nd09 kernel: GFS: fsid=test:gfs-dm1.0: jid=1: Done
-->

Note that the clocks of the 3 nodes are not synchronized, so timestamps from different nodes cannot be compared directly.

What could be the problem?

Thanks for any reply,
Luckey
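
P.S. In case it helps to line up the two failures, here is a minimal Python sketch that pulls the "time = ..." values out of the messages above and prints them in UTC. It assumes that the value is a Unix timestamp in seconds taken from each node's own clock; that is my reading of the output, not something I have confirmed in the GFS source. Because each node stamps its own value, this does not correct for the clock skew between nodes. The names LOG_LINES, PATTERN and withdraw_times are made up for this example.

```python
import re
from datetime import datetime, timezone

# The two "time =" lines copied from the messages above.
LOG_LINES = [
    "May 13 07:30:47 nd12 kernel: GFS: fsid=test:gfs-dm1.2: time = 1147476646",
    "May 13 07:36:51 nd10 kernel: GFS: fsid=test:gfs-dm1.1: time = 1147477010",
]

# syslog prefix (month, day, clock), node name, then the GFS "time = <n>" field.
PATTERN = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<node>\S+)\s+kernel:.*\btime = (?P<epoch>\d+)\s*$"
)


def withdraw_times(lines):
    """Return {node: UTC datetime} for each line carrying a 'time = <n>' field.

    Assumption: <n> is seconds since the Unix epoch on that node's clock.
    """
    result = {}
    for line in lines:
        match = PATTERN.match(line)
        if match:
            epoch = int(match.group("epoch"))
            result[match.group("node")] = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return result


times = withdraw_times(LOG_LINES)
for node, stamp in sorted(times.items(), key=lambda item: item[1]):
    print(node, stamp.isoformat())

# Seconds between the nd12 and nd10 assertions, as each node recorded them.
gap_seconds = int((times["nd10"] - times["nd12"]).total_seconds())
print("gap in seconds:", gap_seconds)
```

By these values the two assertions are 364 seconds apart, which agrees with the syslog timestamps on nd12 and nd10 to within a second or so.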