(Adding the OCFS2 maintainers, since my possibly insane idea proposed
below would definitely impact them!)

On Tue, Aug 25, 2020 at 01:16:16PM +0100, Christoph Hellwig wrote:
> On Tue, Aug 25, 2020 at 02:05:54PM +0200, Jan Kara wrote:
> > Discarding blocks and buffers under a mounted filesystem is hardly
> > anything admin wants to do. Usually it will confuse the filesystem and
> > sometimes the loss of buffer_head state (including b_private field) can
> > even cause crashes like:
>
> Doesn't work if the file system uses multiple devices.  I think we
> just really need to split the fs buffer_head address space from the
> block device one.  Everything else is just going to cause a huge mess.

I wonder if we should go a step further, and stop using struct
buffer_head altogether in jbd2 and ext4 (as well as ocfs2).

This would involve moving whatever structure elements are needed from
struct buffer_head into struct journal_head, and managing writeback and
read requests directly in jbd2.  This would allow us to get detailed
write errors back, which is currently not possible with the buffer_head
infrastructure.

The downside is that this would be a pretty massive change in terms of
LOC, since we use struct buffer_head in a *huge* number of places.  If
we're careful, most of it could be handled by a Coccinelle script that
renames "struct buffer_head" to "struct journal_head".  Fortunately, we
don't actually use that many of the fs/buffer_head functions in
fs/{ext4,ocfs2}/*.c.

One potentially tricky bit is that ocfs2 hasn't been converted to using
iomap, so it's still using __blockdev_direct_IO.  So its data blocks
for DIO would still have to use struct buffer_head (which means the
Coccinelle script won't really work for fs/ocfs2 without a lot of
manual rework), or ocfs2 would have to be switched to use iomap, at
least for DIO support.

What do folks think?

					- Ted