Hi Al,

I wonder if you would consider these two patches.

They extend the functionality of mlockall(MCL_FUTURE) to apply to memory
allocations made when performing O_DIRECT I/O. That is, if MCL_FUTURE is in
effect, the first read or write to an O_DIRECT file descriptor will cache
any memory it allocates, so that nothing needs to be allocated on
subsequent reads or writes.

This is needed for reliable handling of RAID metadata in userspace. When a
device fails, the failure must be recorded in the metadata before further
writes are allowed to complete. As a GFP_KERNEL allocation may block
waiting for arbitrary writes to complete, we must not allow any GFP_KERNEL
allocation while updating the metadata.

The approach I have taken to avoiding GFP_KERNEL allocations in O_DIRECT
handling is to cache the necessary data structures the first time they are
allocated. There are two such data structures, "struct dio" and
"struct bio".

I have seen a host in a memory deadlock where mdmon (which does the
metadata management) was stuck waiting to allocate a 'struct dio', but
could not until writeout was allowed to proceed, and writeout could not
proceed until the metadata was updated.

I have not seen a machine deadlock waiting for a bio. That is a much less
likely deadlock scenario. The bio is allocated from a mempool, so the
allocation will very often succeed. Exhausting the mempool is unlikely, but
I believe it is theoretically possible, as the mempool is shared over
multiple devices.

Thanks,
NeilBrown

---

NeilBrown (2):
      block_dev/DIO: Optionally allocate single 'struct dio' per file.
      block_dev/DIO - cache one bio allocation when caching a DIO.

 fs/block_dev.c     |    7 +++++-
 fs/direct-io.c     |   61 ++++++++++++++++++++++++++++++++++++++++++++++------
 include/linux/fs.h |    6 +++++
 3 files changed, 66 insertions(+), 8 deletions(-)
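
For readers who want a concrete picture of the "cache it the first time it is allocated" idea described in this message, here is a minimal userspace sketch. It is not the code from the patches: struct request_ctx, struct file_ctx, ctx_get(), ctx_put() and ctx_destroy() are hypothetical names invented for illustration, and malloc() merely stands in for a kernel allocation that could block. It assumes a one-slot cache per open file is enough, as the patch titles suggest.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-request structure such as 'struct dio'. */
struct request_ctx {
    char scratch[256];
};

/* Hypothetical per-open-file state holding a one-slot cache. */
struct file_ctx {
    _Atomic(struct request_ctx *) cached; /* NULL when empty or in use */
    atomic_int alloc_count;               /* how often we really allocated */
};

/* Take the cached object if one is present; otherwise allocate a new one. */
static struct request_ctx *ctx_get(struct file_ctx *f)
{
    struct request_ctx *r = atomic_exchange(&f->cached, NULL);
    if (r)
        return r;                 /* reused: no allocation on this path */
    atomic_fetch_add(&f->alloc_count, 1);
    return malloc(sizeof(*r));    /* first use, or slot busy: may "block" */
}

/* Return an object: keep it in the slot if the slot is empty, else free it. */
static void ctx_put(struct file_ctx *f, struct request_ctx *r)
{
    struct request_ctx *expected = NULL;
    if (!atomic_compare_exchange_strong(&f->cached, &expected, r))
        free(r);                  /* slot already occupied */
}

/* Drop the cached object when the file is closed. */
static void ctx_destroy(struct file_ctx *f)
{
    free(atomic_exchange(&f->cached, NULL));
}
```

In this sketch, a process that issues one request at a time on a given file allocates only on its first request; every later ctx_get() is served from the slot and never reaches the allocator. A second concurrent request on the same file still falls back to a fresh allocation.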