On Sat, Mar 02, 2002 at 12:18:44PM -0500, Rechenberg, Andrew wrote:
...
> We just had a pretty bad crash on one of production boxes and the ext2
> filesystem on the data partition of our box had some major filesystem
> corruption.  Needless to say, I am now looking into converting the
> filesystem to ext3 and I have some questions regarding ext3 and Linux
> software RAID.
>
> I have read that previously there were some issues running ext3 on a
> software raid device (/dev/mdN), but that most of those issues are resolved
> by running kernel 2.4.x.  Currently we are running 2.4.16 on our producton
> system and we have a rather complicated hardware/software RAID configuration
> on the box.

  To a large extent what is called "ext3" should probably be called
  "ext2+j", or some such, but "ext3" makes sense for many things.

  You have probably seen my comments.  I did "solve" the problem by
  running my dual-CPU box with a uniprocessor kernel.  Even an SMP
  kernel with the "nosmp" boot option (to run it with only the boot
  processor) didn't work.  My troublesome machine has since been
  co-located at a place to which I don't have daily access.

  How much of the problem lies in the gcc version (RH 7.1, 2.96-97.1)
  and how much in the kernel (2.4.17 release), I don't know.

  It seems to happen when there is:
    - lots of memory
    - at least 2 _fast_ processors
    - a process writing at full tilt a very large file
      (1.3+ times the memory size)

  I never saw the problem with a 128 MB dual PPro200 machine, even with
  the same disks that later, with another motherboard and CPUs, did
  cause problems.

  I haven't seen the problem with my home machine either, which is, if
  possible, an even heftier box than the one which I do get to hang --
  except that it has two IDE disks on the same IDE cable, instead of a
  separate dual-channel IDE controller or a SCSI controller.  (Both of
  which have hung on me in the other box.)

...
> - Is it wise to convert the filesystem on /dev/md1 to ext3?
>
> - Have the issues with ext3 on Linux RAID been resolved?

  Things I have seen in the 2.4.18 test releases don't convince me.
  However, I haven't been able to test them either.  Maybe in a couple
  of weeks' time, but not yet.

> - Will the failing and resyncing of /dev/md1 happening on a daily basis
>   cause problems with the journalling?

  It should be invisible.  Doing it at a quiet moment would be
  advisable, of course.

  Running the resync will take heaps of time, though.  Consider 252 GB
  being synced at 15 MB/sec (even that might be a bit over-optimistic
  with some disks): it takes a "mere" 16800 seconds, or 4h 40m.

  A much smarter internal design supporting a real RAID10 (instead of
  RAID1 on top of a pair of RAID0) might allow syncing all disks in
  parallel, which of course reduces the max speed, but might achieve
  the sync in about an hour or two.

> - Do you think the filesystem would be stable enough for 18x7 availability?

  EXT3, yes.  But likely you should ask RedHat for a supported
  Enterprise kernel.

  There are other reasons why ReiserFS might make sense in your system,
  but unfortunately they don't help with backup...
  .. or maybe...  See the site:  http://devlinux.com/namesys

  If reiserfs can do a snapshot at the filesystem level, and you take
  the backup at the filesystem level..  Then you can have heaps of
  small RAID1 pairs striped together with RAID0 -- e.g. "RAID01" (or
  whichever way those hybrids are called..)

  ... but still, those don't help in case your system experiences the
  RAID1+EXT3 hangup which I kept seeing.

  When you have time, try it:  "tune2fs -j /dev/md1" (have a suitable
  toolset on hand as well), e2fsck, and mount with "-t ext3".
  Then try writing a large file there, e.g.:

      dd if=/dev/zero bs=1024k of=test.file count=12000

  If it does not hang within a few runs, you are probably safe.

> - What kind of overhead is involved after the filesystem is ext3?

  Aside from the journal file, it is exactly the same as EXT2.
  Indeed you can tune2fs an EXT2 filesystem to be EXT3.  I have done that.
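
  The conversion and write test described above could be scripted
  roughly as follows.  This is only a sketch under assumptions: the
  mount point /mnt/data is hypothetical, the initial umount and the
  "-f" on e2fsck are my additions (the steps above only name tune2fs,
  e2fsck, mount and dd), and by default it merely prints the commands
  instead of running them.

```shell
#!/bin/sh
# Sketch: convert an ext2 filesystem on an md device to ext3 and run
# the large-file write test.  Adjust DEV and MNT for the real system.
DEV=/dev/md1          # device named in the message
MNT=/mnt/data         # hypothetical mount point (assumption)
RUN=echo              # dry run: print commands only; use RUN= to execute

convert_and_test() {
    # Assumption: the filesystem is taken offline first, so that
    # e2fsck can safely check it.
    $RUN umount "$MNT"                || return 1
    # Add a journal to the existing ext2 filesystem.
    $RUN tune2fs -j "$DEV"            || return 1
    # Force a full check before remounting.
    $RUN e2fsck -f "$DEV"             || return 1
    # Mount explicitly as ext3.
    $RUN mount -t ext3 "$DEV" "$MNT"  || return 1
    # Write a file well beyond the memory size (about 12 GB here).
    $RUN dd if=/dev/zero bs=1024k of="$MNT/test.file" count=12000
}

convert_and_test
```

  With RUN=echo the script is harmless and just lists the five
  commands.  Rerun the dd step a few times once it executes for real;
  a hang, if it comes, shows up during those writes.
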
  It takes a bit of effort to get the root filesystem to mutate itself
  into ext3 at boot, for some reason.  But you are not doing this for
  your boot system..  The difficulty appears only with an ext2
  filesystem that has been in use for a longer time with older kernels.

> - What journalling mode is suggested for this type of application/system
>   configuration?
> - What size journal would be appropriate give data=ordered vs. data=journal?
> - And any other suggestions/insights/comments.

  For these I don't have opinions.  I have been using the "safe and
  slow" mode (data=ordered), but I am not in a very great hurry most
  of the time.

  Indeed, presently one of my remotely located machines has been
  running RAID1 in degraded mode because one of its IDE disks failed
  2 weeks ago, and I noticed it only yesterday (lacking automated
  monitoring.)  It seems I can get to it in a week's time, but whether
  it will need replacing or just a power cycle, no idea...

> Below is our /etc/raidtab.  Let me know if you need any more information.
> Thank you in advance for all your assistance.
>
> Regards,
> Andrew Rechenberg
> Network Team, Sherman Financial Group
> arechenberg@shermanfinancialgroup.com

/Matti Aarnio