Hi,

Apologies, this is going to be quite long - I'm going to provide as much
info as possible.

I'm running a system with ext3 fs on software RAID. The RAID set-up is as
shown below:

jlm@nijinsky:~$ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md0 : active raid1 hdc1[1] hda1[0]
      96256 blocks [2/2] [UU]

md5 : active raid1 hdk1[1] hde1[0]
      976640 blocks [2/2] [UU]

md6 : active raid1 hdk5[1] hde5[0]
      292672 blocks [2/2] [UU]

md7 : active raid1 hdk6[1] hde6[0]
      1952896 blocks [2/2] [UU]

md8 : active raid1 hdk7[1] hde7[0]
      976640 blocks [2/2] [UU]

md9 : active raid1 hdk8[1] hde8[0]
      9765376 blocks [2/2] [UU]

md10 : active raid0 hdk9[1] hde9[0]
      12108800 blocks 4k chunks

md12 : active raid5 hdk3[3] hde3[2] hdc2[1] hda2[0]
      59978304 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]

md11 : active raid1 hdk4[1] hde4[0]
      170240 blocks [2/2] [UU]

Now, the filesystems are set-up as shown:

jlm@nijinsky:~$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md5              939M  238M  653M  27% /
/dev/md0               91M   23M   63M  27% /boot
/dev/md6              277M  8.1M  254M   4% /tmp
/dev/md7              1.8G  1.5G  360M  81% /usr
/dev/md8              939M  398M  541M  43% /var
/dev/md9              9.2G  5.1G  3.6G  59% /home
/dev/md10              11G  1.7G  9.1G  16% /scratch
/dev/md12              56G   49G  7.7G  87% /global

with /etc/fstab as follows:

jlm@nijinsky:~$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>                   <dump>  <pass>
/dev/md0        /boot           ext3    defaults,errors=remount-ro  0       1
/dev/md5        /               ext3    defaults,errors=remount-ro  0       1
/dev/md6        /tmp            ext3    defaults,errors=remount-ro  0       1
/dev/md7        /usr            ext3    defaults,errors=remount-ro  0       1
/dev/md8        /var            ext2    defaults,errors=remount-ro  0       1
/dev/md9        /home           ext3    defaults,errors=remount-ro  0       1
/dev/md10       /scratch        ext3    defaults,errors=remount-ro  0       1
/dev/md11       none            swap    sw                          0       0
/dev/md12       /global         ext3    defaults,errors=remount-ro  0       1
/dev/sr0        /dvdrom         iso9660 defaults,noauto,ro,user     0       0
proc            /proc           proc    defaults                    0       0

Lastly, I'm running a 2.4.17 kernel.
The machine itself is a Duron800 system with a VIA chipset and the drives
are connected as follows (all kernel modules are compiled in for the
hardware - as is ext3):

Mainboard Primary:   20GB
Mainboard Secondary: 20GB
Promise card 1:      40GB
Promise card 2:      40GB

Now the problem. Since I started running ext3, the /var fs has kept
switching to RO mode. I emailed this list some time ago but haven't had a
chance to test fully. As a stopgap I switched /var back to ext2. I've since
switched back to ext3 on /var and had trouble again (I thought that
upgrading the kernel might make a difference - so I'm now on 2.4.17).

Basically, in my logs I keep getting:

Feb 27 08:22:05 nijinsky kernel: attempt to access beyond end of device
Feb 27 08:22:05 nijinsky kernel: 09:07: rw=2, want=251691012, limit=1952896
Feb 27 22:15:46 nijinsky kernel: attempt to access beyond end of device
Feb 27 22:15:46 nijinsky kernel: 09:08: rw=2, want=447774724, limit=976640
Feb 28 07:35:53 nijinsky kernel: attempt to access beyond end of device
Feb 28 07:35:53 nijinsky kernel: 09:07: rw=2, want=251691012, limit=1952896

These were the 'errors' - but the last error seemed to trip the /var fs
into ro mode. I unmounted the /var partition and ran fsck. Here was the
result:

nijinsky:/home/jlm# fsck /dev/md8
fsck 1.25 (20-Sep-2001)
e2fsck 1.25 (20-Sep-2001)
/dev/md8: recovering journal
/dev/md8 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 3905 has illegal block(s). Clear<y>? yes
Illegal block #2 (31285248) in inode 3905. CLEARED.
Illegal block #4 (164077568) in inode 3905. CLEARED.
Inode 3905, i_size is 2914, should be 24576. Fix<y>? yes
Inode 3905, i_blocks is 8, should be 24. Fix<y>? yes

Inode 18145 has illegal block(s). Clear<y>? yes
Illegal block #2 (134492160) in inode 18145. CLEARED.
Illegal block #4 (74215424) in inode 18145. CLEARED.
Inode 18145, i_size is 2290, should be 24576. Fix<y>? yes
Inode 18145, i_blocks is 8, should be 24. Fix<y>? yes

Deleted inode 30895 has zero dtime. Fix<y>? yes

Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes
Inode 34342 was part of the orphaned inode list. FIXED.
Inode 34343 was part of the orphaned inode list. FIXED.
Inode 34541 was part of the orphaned inode list. FIXED.
Inode 45930 was part of the orphaned inode list. FIXED.
Inode 76328 was part of the orphaned inode list. FIXED.
Inode 76679 was part of the orphaned inode list. FIXED.
Inode 76699 was part of the orphaned inode list. FIXED.
Inode 78881 was part of the orphaned inode list. FIXED.

Inode 80865 has illegal block(s). Clear<y>? yes
Illegal block #2 (111943680) in inode 80865. CLEARED.
Illegal block #3 (2147487744) in inode 80865. CLEARED.
Illegal block #4 (194392064) in inode 80865. CLEARED.
Inode 80865, i_size is 4096, should be 24576. Fix<y>? yes
Inode 80865, i_blocks is 8, should be 16. Fix<y>? yes

Duplicate blocks found... invoking duplicate block passes.
Pass 1B: Rescan for duplicate/bad blocks
Duplicate/bad block(s) in inode 8: 4096
Duplicate/bad block(s) in inode 3905: 4096 4096
Duplicate/bad block(s) in inode 18145: 4096 4096
Duplicate/bad block(s) in inode 80865: 4096
Pass 1C: Scan directories for inodes with dup blocks.
Pass 1D: Reconciling duplicate blocks
(There are 4 inodes containing duplicate/bad blocks.)

File /spool/squid/07/59 (inode #80865, mod time Thu Jul 26 19:16:49 2001)
  has 1 duplicate block(s), shared with 3 file(s):
        <The journal inode> (inode #8, mod time Fri Feb 15 22:33:11 2002)
        /spool/news/message.id/566/<q1fd8.10136$Ah1.912475@news2-win.server.ntlworld.com> (inode #18145, mod time Thu Feb 21 23:16:07 2002)
        /spool/squid/00/20/000020DE (inode #3905, mod time Sat Feb 2 00:05:26 2002)
Clone duplicate/bad blocks<y>? yes

File /spool/news/message.id/566/<q1fd8.10136$Ah1.912475@news2-win.server.ntlworld.com> (inode #18145, mod time Thu Feb 21 23:16:07 2002)
  has 2 duplicate block(s), shared with 3 file(s):
        <The journal inode> (inode #8, mod time Fri Feb 15 22:33:11 2002)
        /spool/squid/07/59 (inode #80865, mod time Thu Jul 26 19:16:49 2001)
        /spool/squid/00/20/000020DE (inode #3905, mod time Sat Feb 2 00:05:26 2002)
Clone duplicate/bad blocks<y>? yes

File /spool/squid/00/20/000020DE (inode #3905, mod time Sat Feb 2 00:05:26 2002)
  has 2 duplicate block(s), shared with 3 file(s):
        <The journal inode> (inode #8, mod time Fri Feb 15 22:33:11 2002)
        /spool/squid/07/59 (inode #80865, mod time Thu Jul 26 19:16:49 2001)
        /spool/news/message.id/566/<q1fd8.10136$Ah1.912475@news2-win.server.ntlworld.com> (inode #18145, mod time Thu Feb 21 23:16:07 2002)
Clone duplicate/bad blocks<y>? yes

File <The journal inode> (inode #8, mod time Fri Feb 15 22:33:11 2002)
  has 1 duplicate block(s), shared with 3 file(s):
        /spool/squid/07/59 (inode #80865, mod time Thu Jul 26 19:16:49 2001)
        /spool/news/message.id/566/<q1fd8.10136$Ah1.912475@news2-win.server.ntlworld.com> (inode #18145, mod time Thu Feb 21 23:16:07 2002)
        /spool/squid/00/20/000020DE (inode #3905, mod time Sat Feb 2 00:05:26 2002)
Duplicated blocks already reassigned or cloned.

Pass 2: Checking directory structure
Directory inode 80865 has an unallocated block #2. Allocate<y>? yes
Directory inode 80865 has an unallocated block #3. Allocate<y>? yes
Directory inode 80865 has an unallocated block #4. Allocate<y>? yes
Directory inode 80865, block 5, offset 0: directory corrupted
Salvage<y>? yes

Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -14840 -71993 -99016 -165850 -165876 -165877 -165878 -165879
Fix<y>? yes
Free blocks count wrong for group #0 (19705, counted=19698). Fix<y>? yes
Free blocks count wrong for group #2 (20689, counted=20690). Fix<y>? yes
Free blocks count wrong for group #3 (22696, counted=22697). Fix<y>? yes
Free blocks count wrong for group #5 (22117, counted=22122). Fix<y>? yes
Inode bitmap differences: -30895 -34342 -34343 -34541 -45930 -76328 -76679 -76699 -78881
Fix<y>? yes
Free inodes count wrong for group #2 (10536, counted=10540). Fix<y>? yes
Free inodes count wrong for group #3 (10676, counted=10677). Fix<y>? yes
Free inodes count wrong for group #5 (10519, counted=10523). Fix<y>? yes
Free inodes count wrong (85120, counted=85129). Fix<y>? yes

/dev/md8: ***** FILE SYSTEM WAS MODIFIED *****
/dev/md8: 36983/122112 files (0.5% non-contiguous), 105372/244160 blocks

Now, when the system was running under ext2, the fs tripped into ro mode
once in three months. Under ext3 it seems to happen daily (the errors in
the syslog are more frequent). I don't think that it is hardware/disks, as
the other partitions have all been OK - and they are spread across the same
physical disks. It is only /var that seems to be affected - and to me it
seems like the machine is under load when it trips.

I'm looking here for suggestions/diagnosis. I realise that this might not
be an ext3 problem, but it has become much more frequent since switching to
ext3 (it happened once under ext2 too, but ext3 seems to make the system
fail more often). I'm also hoping that somebody can make sense of the
numbers in the errors and the fsck log.

If I can provide any more information let me know. Any comments
appreciated.

Many thanks in advance,

John.
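
Note on reading the "beyond end of device" lines: below is a rough sketch of one way to decode them, not a diagnosis. It assumes the "09:07" prefix is a hexadecimal major:minor pair, and relies on block major 9 being the Linux software RAID (md) driver, so 09:07 would be md7 and 09:08 would be md8. The block counts and mount points in the tables are copied from the /proc/mdstat and df output in this mail; the names decode, MDSTAT_BLOCKS and MOUNTS are made up for illustration. If the assumptions hold, the limit values match the mdstat block counts exactly, which suggests want and limit are in the same 1K-block units, and it would mean two of the three messages refer to /usr rather than /var.

```python
import re

# Block counts copied from the /proc/mdstat output in this mail.
MDSTAT_BLOCKS = {"md7": 1952896, "md8": 976640}

# Mount points copied from the df output in this mail.
MOUNTS = {"md7": "/usr", "md8": "/var"}

MD_MAJOR = 9  # Linux block major number for software RAID (md)

# Matches e.g. "09:07: rw=2, want=251691012, limit=1952896".
# Assumption: the major:minor pair is printed as two hex digits each.
LINE_RE = re.compile(
    r"(?P<major>[0-9a-f]{2}):(?P<minor>[0-9a-f]{2}): "
    r"rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)"
)


def decode(line):
    """Decode one want=/limit= kernel line; return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    major = int(m.group("major"), 16)
    minor = int(m.group("minor"), 16)
    want = int(m.group("want"))
    limit = int(m.group("limit"))
    device = f"md{minor}" if major == MD_MAJOR else f"{major}:{minor}"
    return {
        "device": device,
        "mount": MOUNTS.get(device),
        "want": want,
        "limit": limit,
        # True when the kernel's limit equals the mdstat block count.
        "limit_matches_mdstat": MDSTAT_BLOCKS.get(device) == limit,
        # How many times larger the requested offset is than the device.
        "overshoot": want / limit,
    }


LOG = [
    "Feb 27 08:22:05 nijinsky kernel: 09:07: rw=2, want=251691012, limit=1952896",
    "Feb 27 22:15:46 nijinsky kernel: 09:08: rw=2, want=447774724, limit=976640",
    "Feb 28 07:35:53 nijinsky kernel: 09:07: rw=2, want=251691012, limit=1952896",
]

if __name__ == "__main__":
    for entry in LOG:
        print(decode(entry))
```

In each case the requested offset is more than a hundred times the size of the device, so these look like garbage block numbers rather than an access just past the end of the partition.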