Locking problems with Linux 4.9 with NFSD and `fs/iomap.c`

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dear Linux folks,


There seems to be a regression in Linux 4.9 compared to 4.4. Maybe you have an idea.

The system used 4.4.38 without issues, and was updated to 4.9.23 on April 24th. Since Friday, the NFS exports where not accessible anymore. Rebooting the system into 4.9.24, 4.9.24, and 4.9.25 didn’t change anything, and the system went into the some lock right away. Booting 4.4.38 fixed the issue though.

Here is more information.

NFS doesn’t respond to a null call.

```
# tshark -r x.x
1 0.000000 141.14.16.23 -> 141.14.26.2 TCP 74 1005 → 2049 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=327954228 TSecr=0 WS=512 2 0.000245 141.14.26.2 -> 141.14.16.23 TCP 74 2049 → 1005 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=1126765132 TSecr=327954228 WS=512 3 0.000269 141.14.16.23 -> 141.14.26.2 TCP 66 1005 → 2049 [ACK] Seq=1 Ack=1 Win=29696 Len=0 TSval=327954229 TSecr=1126765132
  4   0.000300 141.14.16.23 -> 141.14.26.2  NFS 110 V4 NULL Call
5 0.202652 141.14.16.23 -> 141.14.26.2 TCP 110 [TCP Retransmission] 1005 → 2049 [PSH, ACK] Seq=1 Ack=1 Win=29696 Len=44 TSval=327954432 TSecr=1126765132 6 0.410651 141.14.16.23 -> 141.14.26.2 TCP 110 [TCP Retransmission] 1005 → 2049 [PSH, ACK] Seq=1 Ack=1 Win=29696 Len=44 TSval=327954640 TSecr=1126765132 7 0.818649 141.14.16.23 -> 141.14.26.2 TCP 110 [TCP Retransmission] 1005 → 2049 [PSH, ACK] Seq=1 Ack=1 Win=29696 Len=44 TSval=327955048 TSecr=1126765132 8 1.012429 141.14.26.2 -> 141.14.16.23 TCP 74 [TCP Spurious Retransmission] 2049 → 1005 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=1126766144 TSecr=327955048 WS=512 9 1.012448 141.14.16.23 -> 141.14.26.2 TCP 66 [TCP Dup ACK 3#1] 1005 → 2049 [ACK] Seq=45 Ack=1 Win=29696 Len=0 TSval=327955241 TSecr=1126765132 10 1.674650 141.14.16.23 -> 141.14.26.2 TCP 110 [TCP Retransmission] 1005 → 2049 [PSH, ACK] Seq=1 Ack=1 Win=29696 Len=44 TSval=327955904 TSecr=1126765132 11 3.060502 141.14.26.2 -> 141.14.16.23 TCP 74 [TCP Spurious Retransmission] 2049 → 1005 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=1126768193 TSecr=327955904 WS=512 12 3.060521 141.14.16.23 -> 141.14.26.2 TCP 66 [TCP Dup ACK 3#2] 1005 → 2049 [ACK] Seq=45 Ack=1 Win=29696 Len=0 TSval=327957289 TSecr=1126765132 13 3.338653 141.14.16.23 -> 141.14.26.2 TCP 110 [TCP Retransmission] 1005 → 2049 [PSH, ACK] Seq=1 Ack=1 Win=29696 Len=44 TSval=327957568 TSecr=1126765132
```

``
# tshark -r x.x
1 0.000000 141.14.16.23 -> 141.14.26.2 TCP 74 690 → 2049 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=328229577 TSecr=0 WS=512 2 0.000039 141.14.26.2 -> 141.14.16.23 TCP 74 2049 → 690 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=1127040480 TSecr=328229577 WS=512 3 0.000155 141.14.16.23 -> 141.14.26.2 TCP 66 690 → 2049 [ACK] Seq=1 Ack=1 Win=29696 Len=0 TSval=328229577 TSecr=1127040480
  4   0.000180 141.14.16.23 -> 141.14.26.2  NFS 110 V4 NULL Call
5 0.206937 141.14.16.23 -> 141.14.26.2 TCP 110 [TCP Retransmission] 690 → 2049 [PSH, ACK] Seq=1 Ack=1 Win=29696 Len=44 TSval=328229784 TSecr=1127040480 6 0.414925 141.14.16.23 -> 141.14.26.2 TCP 110 [TCP Retransmission] 690 → 2049 [PSH, ACK] Seq=1 Ack=1 Win=29696 Len=44 TSval=328229992 TSecr=1127040480 7 0.822928 141.14.16.23 -> 141.14.26.2 TCP 110 [TCP Retransmission] 690 → 2049 [PSH, ACK] Seq=1 Ack=1 Win=29696 Len=44 TSval=328230400 TSecr=1127040480 8 1.056498 141.14.26.2 -> 141.14.16.23 TCP 74 [TCP Spurious Retransmission] 2049 → 690 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=1127041536 TSecr=328230400 WS=512
```

Looking at the stack traces of the nfsd process with

```
for p in $(pgrep 'nfsd$'); do echo $p; cat /proc/$p/stack; done
```

all but one have the trace below.


```
[<ffffffff813eded7>] call_rwsem_down_write_failed+0x17/0x30
[<ffffffff8133aa1c>] xfs_file_buffered_aio_write+0x5c/0x2d0
[<ffffffff8133ad2e>] xfs_file_write_iter+0x9e/0x150
[<ffffffff81196b50>] do_iter_readv_writev+0xa0/0xf0
[<ffffffff81197d4a>] do_readv_writev+0x18a/0x230
[<ffffffff8119803c>] vfs_writev+0x3c/0x50
[<ffffffffa02826b5>] nfsd_vfs_write+0xc5/0x280 [nfsd]
[<ffffffffa028e48a>] nfsd4_write+0x17a/0x1d0 [nfsd]
[<ffffffffa02902f9>] nfsd4_proc_compound+0x369/0x5e0 [nfsd]
[<ffffffffa027e218>] nfsd_dispatch+0xa8/0x180 [nfsd]
[<ffffffffa01a66af>] svc_process_common+0x3ff/0x580 [sunrpc]
[<ffffffffa01a696b>] svc_process+0x13b/0x1b0 [sunrpc]
[<ffffffffa027dc7d>] nfsd+0xed/0x160 [nfsd]
[<ffffffff8107b0ba>] kthread+0xca/0xe0
[<ffffffff81a97092>] ret_from_fork+0x22/0x30
[<ffffffffffffffff>] 0xffffffffffffffff
```

The one problematic one has the stack trace below.

```
[<ffffffff811f6f6c>] iomap_write_begin+0x8c/0x120
[<ffffffff811f758b>] iomap_zero_range_actor+0xeb/0x210 # oder iomap_zero_range_actor+0x90/0x210
[<ffffffff811f77e2>] iomap_apply+0xa2/0x110
[<ffffffff811f79b8>] iomap_zero_range+0x58/0x80
[<ffffffff8133a38e>] xfs_zero_eof+0x4e/0xb0
[<ffffffff8133a58d>] xfs_file_aio_write_checks+0x19d/0x1c0
[<ffffffff8133aa39>] xfs_file_buffered_aio_write+0x79/0x2d0
[<ffffffff8133ad2e>] xfs_file_write_iter+0x9e/0x150
[<ffffffff81196b50>] do_iter_readv_writev+0xa0/0xf0
[<ffffffff81197d4a>] do_readv_writev+0x18a/0x230
[<ffffffff8119803c>] vfs_writev+0x3c/0x50
[<ffffffffa02826b5>] nfsd_vfs_write+0xc5/0x280 [nfsd]
[<ffffffffa028e48a>] nfsd4_write+0x17a/0x1d0 [nfsd]
[<ffffffffa02902f9>] nfsd4_proc_compound+0x369/0x5e0 [nfsd]
[<ffffffffa027e218>] nfsd_dispatch+0xa8/0x180 [nfsd]
[<ffffffffa01a66af>] svc_process_common+0x3ff/0x580 [sunrpc]
[<ffffffffa01a696b>] svc_process+0x13b/0x1b0 [sunrpc]
[<ffffffffa027dc7d>] nfsd+0xed/0x160 [nfsd]
[<ffffffff8107b0ba>] kthread+0xca/0xe0
[<ffffffff81a97092>] ret_from_fork+0x22/0x30
[<ffffffffffffffff>] 0xffffffffffffffff
```

Here is the corresponding lines in the source code.

```
info line *0xffffffff811f6f6c Line 124 of "fs/iomap.c" starts at address 0xffffffff811f6f6c <iomap_write_begin+140> and ends at 0xffffffff811f6f70 <iomap_write_begin+144>. info line *0xffffffff811f758b Line 345 of "fs/iomap.c" starts at address 0xffffffff811f758b <iomap_zero_range_actor+235> and ends at 0xffffffff811f758f <iomap_zero_range_actor+239>. info line *0xffffffff811f7530 Line 385 of "fs/iomap.c" starts at address 0xffffffff811f7530 <iomap_zero_range_actor+144> and ends at 0xffffffff811f7538 <iomap_zero_range_actor+152>.
```

Here is the code.

```
[…]
109 static int
110 iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
111                 struct page **pagep, struct iomap *iomap)
112 {
113         pgoff_t index = pos >> PAGE_SHIFT;
114         struct page *page;
115         int status = 0;
116
117         BUG_ON(pos + len > iomap->offset + iomap->length);
118
119         if (fatal_signal_pending(current))
120                 return -EINTR;
121
122 page = grab_cache_page_write_begin(inode->i_mapping, index, flags);
123         if (!page)
124                 return -ENOMEM;
125
126 status = __block_write_begin_int(page, pos, len, NULL, iomap);
127         if (unlikely(status)) {
128                 unlock_page(page);
129                 put_page(page);
130                 page = NULL;
131
132                 iomap_write_failed(inode, pos, len);
133         }
134
135         *pagep = page;
136         return status;
137 }
[…]
```

So it looks like insufficent memory? But the machine has 1 TB of RAM, and I didn’t see any suspicious running `htop`. Also it should have returned from with that error, shouldn’t it have?


Kind regards,

Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux