Hardware does go bad, and that affects journaling filesystems as well. The ext3 manpage points out that plenty of things can wreck your filesystem even if the software is perfect, and that you'd probably want to know it was being slowly munched away -before- it finally goes south completely. (For example, subtle memory errors can gradually turn your FS into garbage, but it may take a while to notice. Capacitors slowly going bad on a motherboard can lead to corruption due to poor power-supply bypassing; vibration can lead to submicrosecond faults in solder joints and connectors... The list is endless.)

Some filesystems are far more vulnerable to single-bit errors than others, so some may fall over long before it would occur to you to run fsck if you run it "only when there's a problem". ext3 is more paranoid about bad hardware than some other popular journaling filesystems, so in fact you might be well-advised to run fsck -more- frequently on the others. (Just be careful---for hilarity's sake, try copying an entire reiserfs filesystem into an ordinary file on a second reiserfs, and then run fsck on -that-. Make sure your backups are readable first; you'll need 'em. http://zork.net/~nick/mail/why-reiserfs-is-teh-sukc has details, and XFS doesn't come off too well in that, either.)

For that matter, subtle hardware issues might eat the data in your files, bit by bit, and you might never notice unless they also started eating your metadata and you wondered why fsck kept finding small errors. How much that matters depends on whether you care that your data might have a few bits of corruption scattered through it.
I recently hit an issue where a motherboard had problems with (a) CPU throttling (it flipped some RAM bits when the CPU was in "slow" mode), -and- (b) dual-channel memory (it flipped some bits, even after the throttling was turned off, if the RAM was in dual-channel mode but not if it was in single-channel mode---no version of memtest86+ was ever able to detect the corruption, but repeated runs of "fsck -n" on the (terabyte) FS yielded -different- results every time!). And this was on top of an encrypted device, so it was quite clear that it wasn't bad bits on the disk or in any of the disk datapaths.

[(a) above was repeatable when reading from a USB stick and from both SATA and IDE disks, which pretty much nailed the coffin shut, and was about when I found out the problem was throttling: repeated md5sums of files with certain bit patterns yielded nondeterministic results if run on an idle machine ten seconds apart, but running them in a tight loop yielded the same results after the first few "random" ones, as did keeping the CPU pegged with another process even if the md5sums were seconds apart. But (b) didn't manifest -at all- except in fsck, and I -ran- the fsck because I thought (a) might have already trashed the filesystem and I wanted to find out whether it had.]

Without fsck, I'd never have discovered the dual-channel problem until it had completely trashed the data (instead of discovering it "only" a month after installing some new RAM -and- thoroughly "testing" it with memtest86+ before putting the machine back in service---memtest86+'s tests didn't discover the dual-channel problem despite days of runtime -and- never noticed the throttling issues because, of course, it runs the CPU flat-out all the time...).

One thing you might want to consider is whether your backups go at least as far back in time as the last time you ran fsck. If they don't, and fsck discovers that bad hardware has been corrupting your data, you're screwed.
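If you want to try the repeated-md5sum test described above on your own box, here's a rough sketch in Python. It is not the script I actually used; the function names, the five runs, and the ten-second pause are all just assumptions for illustration. It hashes the same file several times with an idle gap in between and collects the distinct digests; on healthy hardware you should only ever see one. (Re-reads of a recently read file will usually come out of the page cache, which is the point if it's the RAM you suspect.)

```python
import hashlib
import time


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def distinct_digests(path, runs=5, pause_seconds=10.0):
    """Hash `path` `runs` times, sleeping `pause_seconds` between runs.

    Returns the set of distinct digests seen. More than one element means
    the same bytes hashed differently on different passes, which points at
    the hardware (or at something modifying the file underneath you).
    """
    seen = set()
    for i in range(runs):
        if i:
            # Let the machine go idle so any CPU throttling can kick in.
            time.sleep(pause_seconds)
        seen.add(file_md5(path))
    return seen
```

Call distinct_digests() on some large file that nothing else is writing to; if the returned set has more than one entry, start suspecting the hardware. Setting pause_seconds to 0 gives you the "tight loop" variant for comparison.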
OTOH, running fsck more frequently might mean you don't need to keep complete backups quite as far back.

Since this is the LVM list, I'm assuming that y'all are actually, you know, -running- LVM. And if you are, you can make a snapshot of your filesystem and run "fsck -n" -on the snapshot- so you don't even have to take the FS out of service; if it finds a serious problem, then you can unmount the real filesystem and run a real fsck to fix it. Sure, the system will be slower while fsck is running, but you can ionice it and/or run it at slack(er) times or do whatever else mitigates its impact, and it doesn't matter how long it takes to run if it's running readonly off a snapshot...

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
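For anyone who wants to script the snapshot check described above, here's a minimal sketch in Python. Treat it as an outline under assumptions, not a tested tool: the snapshot name "fsck_snap" and the 5G snapshot size are made up, your volume group needs enough free extents to hold the snapshot, and it has to run as root. It creates the snapshot with lvcreate, runs "fsck -n" on it under ionice's idle class, and removes the snapshot afterwards whether or not fsck complained.

```python
import subprocess


def snapshot_fsck_commands(vg, lv, snap_name="fsck_snap", snap_size="5G"):
    """Build the three command lines: create snapshot, check it, remove it.

    `snap_name` and `snap_size` are illustrative defaults, not recommendations.
    """
    origin = f"/dev/{vg}/{lv}"
    snapshot = f"/dev/{vg}/{snap_name}"
    create = ["lvcreate", "--snapshot", "--size", snap_size,
              "--name", snap_name, origin]
    # ionice -c 3 is the "idle" I/O class; fsck -n answers "no" to everything,
    # so nothing is ever written to the snapshot.
    check = ["ionice", "-c", "3", "fsck", "-n", snapshot]
    remove = ["lvremove", "-f", snapshot]
    return create, check, remove


def check_via_snapshot(vg, lv, **kwargs):
    """Run a read-only fsck against a temporary snapshot; return fsck's exit code."""
    create, check, remove = snapshot_fsck_commands(vg, lv, **kwargs)
    subprocess.run(create, check=True)
    try:
        # No check=True here: a nonzero exit from fsck is the result we want to see.
        return subprocess.run(check).returncode
    finally:
        # Always drop the snapshot so it can't fill up and get invalidated later.
        subprocess.run(remove, check=True)
```

Something like check_via_snapshot("vg0", "home") (both names hypothetical) returns fsck's exit status; zero means it found nothing, and anything else means it's time to read the output and think about scheduling a real fsck on the unmounted filesystem.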