On Wed, Nov 24, 2021 at 04:23:31PM +1100, NeilBrown wrote:
>
> It would get particularly painful if some system call started returning
> -ENOMEM, which had never returned that before.  I note that ext4 uses
> __GFP_NOFAIL when handling truncate.  I don't think user-space would be
> happy with ENOMEM from truncate (or fallocate(PUNCH_HOLE)), though a
> recent commit which adds it focuses more on wanting to avoid the need
> for fsck.

If the inode is in use (via an open file descriptor) when it is
unlinked, we can't actually do the truncate until the inode is
evicted, and at that point, there is no user space to return to.  For
that reason, the evict_inode() method is not *allowed* to fail.

So this is why we need to use GFP_NOFAIL or an open-coded retry loop.
The alternative would be to mark the file system corrupt, and then
either remount the file system read-only, panic the system and reboot,
or leave the file system corrupted ("don't worry, be happy").  I
considered GFP_NOFAIL to be the lesser of the evils.  :-)

If the VFS allowed evict_inode() to fail, all it could do is to put
the inode back on the list of inodes to be later evicted --- which is
to say, we would have to add a lot of complexity to effectively add a
gigantic retry loop.

Granted, we wouldn't need to be holding any locks in between retries,
so perhaps it's better than adding a retry loop deep in the guts of
the ext4 truncate codepath.  But then we would need to worry about
userspace getting ENOMEM from system calls which, historically, have
never failed that way.

I suppose we could also solve this problem by adding retry logic in
the top-level VFS truncate codepath, so instead of returning ENOMEM,
we just retry the truncate(2) system call and hope that we have enough
memory to succeed this time.

After all, what can userspace do if truncate() fails with ENOMEM?
It can fail the userspace program, which in the case of a long-running
daemon such as mysqld, is basically the userspace equivalent of "panic
and reboot", or it can retry the truncate(2) system call at the
userspace level.

Are we detecting a pattern here?  There will always be cases where the
choice is "panic" or "retry".

					- Ted