On Fri, Jan 11, 2019 at 03:13:05PM -0600, Steve French wrote:
> On Fri, Jan 11, 2019 at 7:28 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > Are you saying the SIGALRM interrupts ftruncate() and causes the ftruncate
> > to fail?
>
> So ftruncate does not really fail (the file contents and size match on
> source and target after the copy) but the scp 'fails' and the user
> would be quite confused (and presumably the network stack doesn't like
> this signal, which can cause disconnects etc. which in theory could
> cause reconnect/data loss issues in some corner cases).

You've run into the problem that userspace simply doesn't check the
return value from syscalls.  It's not just scp, it's every program.
Looking through cifs, you seem to do a lot of wait_event_interruptible()
where you maybe should be doing wait_event_killable()?

> ftruncate(3, 262144000) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
> --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
> --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL} ---
> rt_sigreturn({mask=[ALRM]}) = 0
> ioctl(1, TIOCGWINSZ, {ws_row=51, ws_col=156, ws_xpixel=0, ws_ypixel=0}) = 0
> getpgrp() = 82563

Right ... so the code never calls ftruncate() again.  Changing all of
userspace is just not going to happen; maybe you could get stuff fixed in
libc, but really ftruncate() should only be interrupted by a fatal signal
and not by SIGALRM.