If you squeeze out every byte, won't you still have a short write? And the
written data wouldn't be cut at the bad place, but it would have a weird
hole or discontinuity there.

-Mike

On Wed, Sep 14, 2016 at 5:34 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> Right now writev() with 3-iovec array that has unmapped address in
> the second element and total length less than PAGE_SIZE will write the
> first segment and stop at that.  Among other things, it guarantees the
> short copy, and I would rather have it yield 0-bytes write (and -EFAULT as
> return value).
>
> All POSIX has to say about that is this (in 2.3 Error Numbers):
>
> [EFAULT]
>     Bad address. The system detected an invalid address in attempting to use
>     an argument of a call. The reliable detection of this error cannot be
>     guaranteed, and when not detected may result in the generation of a signal,
>     indicating an address violation, which is sent to the process.
>
> Note that unmapped page in the middle of a range covered already can lead to
> the same kind of short write - i.e. if we have
>     p = mmap(0, 3*4096, PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
>     munmap(p + 4096, 4096);
>     fd = open("/tmp/foo", O_CREAT|O_TRUNC|O_RDWR, 0777);
>     write(fd, p + 2048, 8192);
>
> write() will yield -EFAULT, not a 2Kb stored.  The same will happen with
>     writev(fd, &(struct iovec){p + 2048, 8192}, 1);
> BTW, adding lseek(fd, 2049, SEEK_SET); before that write (or writev) will
> result in 2047 bytes being written by the latter.
>
> IOW, we do not try to squeeze every byte that can be squeezed out of the
> buffer; generally, an unmapped address anywhere in PAGE_SIZE worth of data
> that would go into the same page-aligned chunk of destination can result in
> short write cut at the beginning of that chunk.  iovec boundaries act
> as barriers to short writes, mostly by accident.
>
> Do we need to preserve that special treatment of iovec boundaries?
> I would really like to get rid of that - the current behaviour is an easy and
> reliable way to trigger a short copy case in ->write_end() and those are
> fairly brittle.  Sure, we still need to cope with them, and I think I've got
> all instances in the current mainline fixed, but they are often suboptimal.
>
> Objections?
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
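
For anyone who wants to try the quoted mmap()/munmap()/write() experiment,
here is a minimal sketch. It is not code from the thread:
write_across_hole(), its parameters and the -2 "setup failed" return value
are made up for illustration, and it uses an unlinked mkstemp() file instead
of /tmp/foo. It scales the offsets by the runtime page size, so with 4096-byte
pages it issues the same write(fd, p + 2048, 8192) as above.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the thread.  Maps three pages, unmaps the
 * middle one, seeks to file_offset in a temporary file and writes 2 * page
 * bytes starting at p + page / 2, so the source range straddles the hole.
 *
 * Returns the write()/writev() result, with errno saved in *err when that
 * result is -1.  Returns -2 if the setup itself failed.
 */
static ssize_t write_across_hole(int use_writev, off_t file_offset, int *err)
{
	char path[] = "/tmp/hole-demo-XXXXXX";
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	ssize_t ret = -2;
	char *p;
	int fd;

	assert(err != NULL);
	*err = 0;

	p = mmap(NULL, 3 * page, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (p == MAP_FAILED)
		return -2;
	if (munmap(p + page, page) != 0) {
		munmap(p, 3 * page);
		return -2;
	}

	fd = mkstemp(path);
	if (fd >= 0) {
		unlink(path);	/* the file goes away once fd is closed */
		if (lseek(fd, file_offset, SEEK_SET) != (off_t)-1) {
			if (use_writev) {
				struct iovec iov = {
					.iov_base = p + page / 2,
					.iov_len = 2 * page,
				};
				ret = writev(fd, &iov, 1);
			} else {
				ret = write(fd, p + page / 2, 2 * page);
			}
			if (ret == -1)
				*err = errno;
		}
		close(fd);
	}

	/* The middle page is already gone; drop the two that remain. */
	munmap(p, page);
	munmap(p + 2 * page, page);
	return ret;
}
```

write_across_hole(0, 0, &err) corresponds to the plain write() case and
write_across_hole(1, 2049, &err) to the lseek() + writev() variant (the 2049
only matches the quoted numbers with 4096-byte pages). Whether you get -EFAULT
or a short count will depend on the kernel version and the filesystem behind
/tmp; the only thing the sketch assumes is that the result is never more than
the half page that precedes the hole.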