Re: scp bug due to progress indicator when copying from remote to local on Linux

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Fri, 11 Jan 2019 15:05:07 -0800

On Fri, Jan 11, 2019 at 03:50:02PM -0600, Steve French wrote:
> On Fri, Jan 11, 2019 at 3:22 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > Right ... so the code never calls ftruncate() again.  Changing all of
> > userspace is just not going to happen; maybe you could get stuff fixed in
> > libc, but really ftruncate() should only be interrupted by a fatal signal
> > and not by SIGALRM.
> 
> Looking at the places where wait_event_interruptible is called, I didn't
> see code in fs/cifs that would match the (presumably) relevant code path;
> mostly those calls are in the smbdirect (RDMA) code.  For example,
> cifs_setattr does call filemap_write_and_wait, but as it goes down into
> the mm layer and then to cifs_writepages and the SMB3 write code, I didn't
> spot a "wait_event_interruptible" in that path (I might have missed
> something in the mm layer).  I do see one in the cifs reconnect path, but
> that is not what we are typically hitting.  Any ideas how to match what we
> are blocked in when we get the annoying SIGALRM?  Another vague thought:
> is it possible to block SIGALRM across all of cifs_setattr?  If it is, why
> do so few (only 3!) file systems (ceph, jffs2, ocfs2) ever call
> sigprocmask?

You can see where a task is currently sleeping with 'cat /proc/$pid/stack'.
If you can provoke a long duration ftruncate, that'd be a good place to
start looking.
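
For anyone testing from userspace in the meantime, the retry that the
caller is missing would look roughly like the sketch below.  The names
ftruncate_retry() and truncate_tmpfile_demo() are made up for illustration,
and this only works around the problem for a single caller; it is not a
substitute for fixing the kernel side.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: call ftruncate() again if a signal (e.g. SIGALRM)
 * interrupted it and the kernel returned EINTR.
 */
int ftruncate_retry(int fd, off_t length)
{
	int ret;

	do {
		ret = ftruncate(fd, length);
	} while (ret == -1 && errno == EINTR);

	return ret;
}

/*
 * Demo only: create an unlinked temporary file, resize it with the
 * helper above, and return the resulting size (or -1 on any failure).
 */
off_t truncate_tmpfile_demo(off_t length)
{
	char path[] = "/tmp/ftruncate-demo-XXXXXX";
	struct stat st;
	off_t size = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);	/* file goes away once fd is closed */

	if (ftruncate_retry(fd, length) == 0 && fstat(fd, &st) == 0)
		size = st.st_size;

	close(fd);
	return size;
}
```

On a local filesystem the loop will normally run exactly once; it only
matters where ftruncate() can sleep long enough for a signal to arrive.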