> You make it sound like the heuristic decision must be made
> *after* trying to clone, but it can be made before and pass
> flags to the kernel whether or not to fall back to copy.

True, though I simplified slightly. There are other things we try first
if the clone fails, like creating a hardlink. If cloning fails, we also
often want to copy only part of the file (again decided heuristically,
based on whether more than what the program asked for will be useful
for debugging).

> copy_file_range(2) has an unused flags argument.
> Adding support for flags like:
> COPY_FILE_RANGE_BY_FS
> COPY_FILE_RANGE_BY_KERNEL

That would solve it of course, and I'd be happy with that solution,
but it seems like we'd then end up with just another spelling of the
cloning ioctls, with subtly different semantics.

> I can also suggest a workaround for you.
> If your only problem is bind mounts and if recorder is a privileged
> process (CAP_DAC_READ_SEARCH) then you can use a "master"
> bind mount to perform all clone operations on.
> Use name_to_handle_at(2) to get sb file handle of source file.
> Use open_by_handle_at(2) to get an open file descriptor of the source
> file under the "master" bind mount.

Thanks, that's a very valuable suggestion - I hadn't considered that.
Unfortunately, I don't think the recorder generally has those
privileges. It doesn't help in my use case, since I'm recording a
container that makes use of user namespaces, so nothing requires
privilege, but it does seem like it would be useful when the recorder
does have the appropriate capabilities (rr already has a mode where it
runs with privilege, e.g. for recording setuid binaries).

Keno
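
P.S. To make the fallback order concrete, here is a rough sketch of
"clone, else hardlink, else copy part of the file". It is not rr's
actual code: save_file(), enum save_result and the max_copy parameter
are names made up for illustration, and max_copy merely stands in for
the heuristic that decides how much of the file is worth keeping. The
copy step deliberately uses plain read(2)/write(2), so the kernel
never clones on our behalf.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FICLONE */
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes, for illustration only. */
enum save_result { SAVE_FAILED = -1, SAVE_CLONED, SAVE_HARDLINKED, SAVE_COPIED };

/*
 * Preserve src at dst (dst must not exist yet): try a reflink first,
 * then a hardlink, then copy at most max_copy bytes by hand.
 */
enum save_result save_file(const char *src, const char *dst, size_t max_copy)
{
    char buf[65536];
    size_t left = max_copy;
    int in, out;

    in = open(src, O_RDONLY | O_CLOEXEC);
    if (in < 0)
        return SAVE_FAILED;

    /* 1. Explicit clone via the ioctl; no implicit copy can happen here. */
    out = open(dst, O_WRONLY | O_CREAT | O_EXCL | O_CLOEXEC, 0600);
    if (out < 0) {
        close(in);
        return SAVE_FAILED;
    }
    if (ioctl(out, FICLONE, in) == 0) {
        close(out);
        close(in);
        return SAVE_CLONED;
    }
    /* Clone failed: drop the empty file so link(2) can create dst. */
    close(out);
    unlink(dst);

    /* 2. Hardlink (only works within one filesystem). */
    if (link(src, dst) == 0) {
        close(in);
        return SAVE_HARDLINKED;
    }

    /* 3. Copy only a prefix of the file. */
    out = open(dst, O_WRONLY | O_CREAT | O_EXCL | O_CLOEXEC, 0600);
    if (out < 0) {
        close(in);
        return SAVE_FAILED;
    }
    while (left > 0) {
        size_t want = left < sizeof(buf) ? left : sizeof(buf);
        ssize_t off = 0, n = read(in, buf, want);

        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0)
            goto fail;
        if (n == 0)
            break;              /* EOF before max_copy bytes */
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0 && errno == EINTR)
                continue;
            if (w < 0)
                goto fail;
            off += w;
        }
        left -= (size_t)n;
    }
    close(out);
    close(in);
    return SAVE_COPIED;

fail:
    close(out);
    close(in);
    unlink(dst);
    return SAVE_FAILED;
}
```

The point of spelling it out is that each step is an explicit
userspace decision; a copy_file_range(2) that silently picks between
cloning and copying would hide exactly the choice we want to make
ourselves.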