Since seq_read_iter looks mature enough to be used for /proc/<pid>/maps, re-allow applications to perform zero-copy data forwarding from it. Some executable-inspecting tools (e.g. pwntools) rely on patching entry point instructions with minimal machine code that uses sendfile to read /proc/self/maps to stdout. The sendfile call allows them to do it without excessive allocations, which would change the mappings, and therefore distort the information. This is inspired by the series by Cristoph Hellwig (linked). Link: https://lore.kernel.org/lkml/20201104082738.1054792-1-hch@xxxxxx/ Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops") Cc: Alexey Dobriyan <adobriyan@xxxxxxxxx> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Cc: Christoph Hellwig <hch@xxxxxx> Signed-off-by: Arkadiusz Kozdra (Arusekk) <arek_koz@xxxxx> --- v3: - Only commit message changed. - Clarify what tools use this. - Do not mention performance. The average execution time of a patched static ELF outputting to a pipe (the use case of pwntools inspecting mappings of an executable) was varying both before and after ca. 3.43ms +-0.05ms (I decided that the performance impact is not worth mentioning in the commit message). I think that the change should probably marginally improve speed, but it will most likely also affect the memory footprint and as such likely minimally decrease power consumption (I suppose it would only be measurable when the mappings description grows many pages long). Speed might be more affected in pathological cases like a close-to-OOM scenario, but I was unable to test this reliably. I did my tests under qemu-system-x86_64 on a Ryzen 4500 host without kvm, with default kernel config. If someone wants to also test this feature of pwntools for themselves, it can be used as follows and should print something other than None: $ pip install pwntools $ python3 >>> from pwn import * >>> print(ELF("/bin/true").libc) Sorry for the delay, but it took me much time to figure out some low-overhead timing methods. Does this change need selftests? It looks like it should never break again if it only uses common code hopefully tested elsewhere. fs/proc/task_mmu.c | 3 ++- fs/proc/task_nommu.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index e862cab69583..06282294ddb8 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -351,7 +351,8 @@ static int pid_maps_open(struct inode *inode, struct file *file) const struct file_operations proc_pid_maps_operations = { .open = pid_maps_open, - .read = seq_read, + .read_iter = seq_read_iter, + .splice_read = generic_file_splice_read, .llseek = seq_lseek, .release = proc_map_release, }; diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c index a6d21fc0033c..e55e79fd0175 100644 --- a/fs/proc/task_nommu.c +++ b/fs/proc/task_nommu.c @@ -295,7 +295,8 @@ static int pid_maps_open(struct inode *inode, struct file *file) const struct file_operations proc_pid_maps_operations = { .open = pid_maps_open, - .read = seq_read, + .read_iter = seq_read_iter, + .splice_read = generic_file_splice_read, .llseek = seq_lseek, .release = map_release, }; -- 2.31.1