> On Aug 5, 2016, at 8:56 AM, Theodore Ts'o <tytso@xxxxxxx> wrote:
>
> On Fri, Aug 05, 2016 at 09:29:59AM +0300, Nikolay Borisov wrote:
>>> The easiest way to fix this is to defer the ext4_commit_super() to a
>>> workqueue. We only need this in the errors=continue case, and in that
>>> scenario we're not in a hurry when the superblock gets written out.
>>
>> Is errors=continue the default option if nothing specifically is
>> specified at mount time, since I don't have this set explicitly:
>>
>> /dev/vda / ext4 rw,relatime,data=ordered 0 0
>
> Yes, it's the default. I keep wondering whether we should change the
> default to remount-ro or even panic, since people sometimes don't
> notice the "file system has been corrupted" messages, and then
> they can end up losing a lot more data than if we forced them to
> address the issue right away.

I'd definitely be in favour of making the default "errors=remount-ro".
We've been setting that explicitly for years, since otherwise people
may not notice their ongoing problems until the filesystem completely
explodes.

Related to that, there is a Lustre patch to handle inconsistencies
between group descriptors and block/inode bitmaps by marking only the
affected group as unusable for new allocations, instead of marking the
whole filesystem in error. Is that something that is of interest to a
wider audience? A patch against RHEL7 is attached, but it could be
updated for newer kernels if there is interest.

Cheers, Andreas
Attachment: ext4-corrupted-inode-block-bitmaps-handling-patches.patch
Description: Binary data

Attachment: signature.asc
Description: Message signed with OpenPGP using GPGMail
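
The message above contrasts two ways of reacting to an inconsistency: raising a filesystem-wide error, whose effect depends on the errors= mount option, and fencing off only the affected block group. The following is a toy user-space model in Python, meant only to illustrate that contrast. It is not the attached patch and not kernel code; every name in it (ToyFs, Group, bitmap_corrupt, pick_group and so on) is hypothetical, and the "panic" case is simulated with an exception.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorsPolicy(Enum):
    """Toy stand-in for the errors= mount option values."""
    CONTINUE = "continue"      # record the error and keep going
    REMOUNT_RO = "remount-ro"  # switch the filesystem to read-only
    PANIC = "panic"            # halt (simulated here with an exception)


@dataclass
class Group:
    free_blocks: int
    bitmap_corrupt: bool = False  # hypothetical per-group flag


@dataclass
class ToyFs:
    policy: ErrorsPolicy
    groups: list
    read_only: bool = False
    error_seen: bool = False

    def fs_error(self):
        """Whole-filesystem error: the outcome depends on the policy."""
        self.error_seen = True
        if self.policy is ErrorsPolicy.REMOUNT_RO:
            self.read_only = True
        elif self.policy is ErrorsPolicy.PANIC:
            raise RuntimeError("toy panic")

    def mark_group_corrupt(self, index):
        """Per-group handling: fence off one group, leave the rest usable."""
        self.groups[index].bitmap_corrupt = True

    def pick_group(self):
        """Return the index of the first group usable for a new
        allocation, or None if nothing can be allocated."""
        if self.read_only:
            return None
        for i, group in enumerate(self.groups):
            if not group.bitmap_corrupt and group.free_blocks > 0:
                return i
        return None
```

In this model, calling mark_group_corrupt(0) on a two-group filesystem leaves pick_group() returning group 1, so allocations carry on elsewhere. Calling fs_error() under the remount-ro policy makes pick_group() return None for every group, and under the continue policy it only sets error_seen, which is the case where a problem can go unnoticed.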