On Mon, Feb 15, 2021 at 3:27 AM Nicolas Boichat <drinkcat@xxxxxxxxxxxx> wrote: > > On Mon, Feb 15, 2021 at 9:12 AM Ian Lance Taylor <iant@xxxxxxxxxx> wrote: > > > > On Sun, Feb 14, 2021 at 4:38 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > > > > > On Fri, Feb 12, 2021 at 03:54:48PM -0800, Darrick J. Wong wrote: > > > > On Sat, Feb 13, 2021 at 10:27:26AM +1100, Dave Chinner wrote: > > > > > > > > > If you can't tell from userspace that a file has data in it other > > > > > than by calling read() on it, then you can't use cfr on it. > > > > > > > > I don't know how to do that, Dave. :) > > > > > > If stat returns a non-zero size, then userspace knows it has at > > > least that much data in it, whether it be zeros or previously > > > written data. cfr will copy that data. The special zero length > > > regular pipe files fail this simple "how much data is there to copy > > > in this file" check... > > > > This suggests that if the Go standard library sees that > > copy_file_range reads zero bytes, we should assume that it failed, and > > should use the fallback path as though copy_file_range returned > > EOPNOTSUPP or EINVAL. This will cause an extra system call for an > > empty file, but as long as copy_file_range does not return an error > > for cases that it does not support we are going to need an extra > > system call anyhow. > > > > Does that seem like a possible mitigation? That is, are there cases > > where copy_file_range will fail to do a correct copy, and will return > > success, and will not return zero? > > I'm a bit worried about the sysfs files that report a 4096 bytes file > size, for 2 reasons: > - I'm not sure if the content _can_ actually be longer (I couldn't > find a counterexample) > - If you're unlucky enough to have a partial write in the output > filesystem, you'll get a non-zero return value and probably corrupted > content. > First of all, I am in favor of fixing the regression in the kernel caused by the change of behavior in v5.3. Having said that, I don't think it is a good idea to use ANY tool to copy a zero size pseudo file. rsync doesn't copy any data either if you try to use it to copy tracing into a temp file. I think it is perfectly sane for any tool to check file size before trying to read/copy. w.r.t. short write/short read, did you consider using the off_in/off_out arguments? I *think* current kernel CFR should be safe to use as long as the tool: 1. Checks file size before copy 2. Does not try to copy a range beyond EOF 3. Pass off_in/off_out args and verify that off_in/off_out advances as expected Thanks, Amir.