On Thu 15-10-15 13:46:44, Andrew Morton wrote:
> On Mon, 12 Oct 2015 14:45:23 +0200 Jan Kara <jack@xxxxxxxx> wrote:
>
> > Currently a simple program below issues a sendfile(2) system call which
> > takes about 62 days to complete in my test KVM instance.
>
> Geeze some people are impatient.
>
> >         int fd;
> >         off_t off = 0;
> >
> >         fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
> >         ftruncate(fd, 2);
> >         lseek(fd, 0, SEEK_END);
> >         sendfile(fd, fd, &off, 0xfffffff);
> >
> > Now you should not ask kernel to do a stupid stuff like copying 256MB in
> > 2-byte chunks and call fsync(2) after each chunk but if you do, sysadmin
> > should have a way to stop you.
> >
> > We actually do have a check for fatal_signal_pending() in
> > generic_perform_write() which triggers in this path however because we
> > always succeed in writing something before the check is done, we return
> > value > 0 from generic_perform_write() and thus the information about
> > signal gets lost.
>
> ah.
>
> > Fix the problem by doing the signal check before writing anything. That
> > way generic_perform_write() returns -EINTR, the error gets propagated up
> > and the sendfile loop terminates early.
> >
> > ...
> >
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -2488,6 +2488,11 @@ again:
> >                         break;
> >                 }
> >
> > +               if (fatal_signal_pending(current)) {
> > +                       status = -EINTR;
> > +                       break;
> > +               }
> > +
> >                 status = a_ops->write_begin(file, mapping, pos, bytes, flags,
> >                                                 &page, &fsdata);
> >                 if (unlikely(status < 0))
> > @@ -2525,10 +2530,6 @@ again:
> >                 written += copied;
> >
> >                 balance_dirty_pages_ratelimited(mapping);
> > -               if (fatal_signal_pending(current)) {
> > -                       status = -EINTR;
> > -                       break;
> > -               }
> >         } while (iov_iter_count(i));
> >
> >         return written ? written : status;
>
> This won't work, will it?  If user hits ^C after we've written a few
> pages, `written' is non-zero and the same thing happens?

It does work - I've tested it :).
Sure, the generic_perform_write() call that is running when the signal is
delivered will return with a value > 0. But the interesting thing is what
happens after that: either we return to userspace (and then we are fine), or
generic_perform_write() gets called again because there's more to write, and
*that* call will return -EINTR, which ends up terminating the whole sendfile
syscall.

Actually there is one general lesson to be learned here: when you check for
a fatal signal and bail out, it's better to do it before doing any work. That
way things keep working even if the function is called in a loop.

                                                                Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
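
The lesson about checking for a fatal signal before doing any work can be
illustrated with a small userspace model. This is only a sketch, not kernel
code: fake_signal_pending, write_check_after(), write_check_first() and
outer_loop() are hypothetical names invented for the illustration, and the
"write" is just a counter. outer_loop() stands in for a caller that, like
the sendfile path here, invokes the writer once per small piece.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for fatal_signal_pending(current). */
static bool fake_signal_pending;

/* Old ordering: do the work, then check. At least one chunk is always
 * written, so the "written ? written : status" return hides -EINTR. */
static long write_check_after(size_t count, size_t chunk)
{
    long written = 0, status = 0;

    while (count > 0) {
        size_t n = count < chunk ? count : chunk;

        written += (long)n;     /* pretend to copy n bytes */
        count -= n;
        if (fake_signal_pending) {
            status = -EINTR;
            break;
        }
    }
    return written ? written : status;
}

/* New ordering: check first. A call that starts with the signal already
 * pending writes nothing and therefore returns -EINTR. */
static long write_check_first(size_t count, size_t chunk)
{
    long written = 0, status = 0;

    while (count > 0) {
        size_t n;

        if (fake_signal_pending) {
            status = -EINTR;
            break;
        }
        n = count < chunk ? count : chunk;
        written += (long)n;     /* pretend to copy n bytes */
        count -= n;
    }
    return written ? written : status;
}

/* Caller that feeds the writer one piece at a time and stops on error.
 * The fake signal is raised just before call number signal_at_call
 * (0-based). Returns how many writer calls were made. */
static int outer_loop(long (*writer)(size_t, size_t),
                      size_t total, size_t piece, int signal_at_call)
{
    int calls = 0;

    fake_signal_pending = false;
    while (total > 0) {
        size_t n = total < piece ? total : piece;
        long ret;

        if (calls == signal_at_call)
            fake_signal_pending = true;
        ret = writer(n, piece);
        calls++;
        if (ret < 0)
            break;
        total -= (size_t)ret;
    }
    return calls;
}
```

With 20 bytes fed in 2-byte pieces and the fake signal raised before the
fourth call, outer_loop(write_check_after, 20, 2, 3) runs all 10 calls
because every call reports 2 bytes written, while
outer_loop(write_check_first, 20, 2, 3) stops after 4 calls because the
fourth one returns -EINTR.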