On Tue, May 23, 2017 at 01:01:06PM +0200, Gionatan Danti wrote:
> On 23/05/2017 12:56, Gionatan Danti wrote:
> > Does a full thin pool *really* report an ENOSPC? In all my tests, I
> > simply see "Buffer I/O error on dev" in the dmesg output (see below).
> 
> Ok, I forgot to attach the debug logs :p
> 
> This is my initial LVM state:
> 
> [root@blackhole tmp]# lvs
>   LV       VG        Attr       LSize  Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert
>   root     vg_system -wi-ao---- 50.00g
>   swap     vg_system -wi-ao----  7.62g
>   thinpool vg_system twi-aot---  1.00g                   1.51   0.98
>   thinvol  vg_system Vwi-aot---  2.00g thinpool          0.76
> [root@blackhole tmp]# lvchange vg_system/thinpool --errorwhenfull=y
>   Logical volume vg_system/thinpool changed.
> 
> I created an XFS filesystem on /dev/vg_system/thinvol and mounted it
> under /mnt/storage. Then I filled it:
> 
> [root@blackhole tmp]# dd if=/dev/zero of=/mnt/storage/disk.img bs=1M count=2048 oflag=sync
> dd: error writing ‘/mnt/storage/disk.img’: Input/output error

Aha, you are using the sync flag; that's why you are getting I/O errors
instead of ENOSPC. I don't remember off the top of my head exactly why,
it's been a while since I started working on this XFS and dm-thin
integration, but IIRC the problem is that XFS reserves the data it needs
and does not expect to get an ENOSPC once the device "has space", so when
the sync occurs, kaboom. I should take another look at it.

> [ 3005.331830] XFS (dm-6): Mounting V5 Filesystem
> [ 3005.443769] XFS (dm-6): Ending clean mount
> [ 5891.595901] device-mapper: thin: Data device (dm-3) discard unsupported: Disabling discard passdown.
> [ 5970.314062] device-mapper: thin: 253:4: reached low water mark for data device: sending event.
> [ 5970.358234] device-mapper: thin: 253:4: switching pool to out-of-data-space (error IO) mode
> [ 5970.358528] Buffer I/O error on dev dm-6, logical block 389248, lost async page write
> [ 5970.358546] Buffer I/O error on dev dm-6, logical block 389249, lost async page write
> [ 5970.358577] Buffer I/O error on dev dm-6, logical block 389255, lost async page write
> [ 5970.358583] Buffer I/O error on dev dm-6, logical block 389256, lost async page write
> [ 5970.358594] Buffer I/O error on dev dm-6, logical block 389257, lost async page write
> 
> This appears as a "normal" I/O error, right? Or am I missing something?

Yeah, I don't remember exactly the details of this part of the problem,
but yes, it looks like you are also hitting the problem I've been working
on, which basically makes XFS spin indefinitely in xfsaild, trying to
retry the buffers that failed, but it can't because they are flush
locked. It basically has all the data committed to the AIL but can't
flush it to its final location due to lack of space, so you will keep
seeing this message until you either permanently fail the buffers,
expand the dm pool, or unmount the filesystem. Currently, in all 3
cases, XFS can hang, unless you have set the 'max_retries' configuration
to '0' before reproducing the problem.

Which kernel version are you using?

If you have the possibility, you can test my patches to fix this
problem:

https://www.spinics.net/lists/linux-xfs/msg06986.html

There will certainly be a V3, but they shouldn't explode your system :)
and more testing is always welcome.

With the patchset you will still get the errors, since the device will
not have the space XFS expects it to have, but the errors will simply go
away as soon as you extend the pool device to allow more space, or the
filesystem will shut down if you try to unmount it, instead of hanging.

Cheers.
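[Editor's note: for readers wanting to reproduce the overcommit above,
the setup implied by the report can be sketched roughly as below. The
lvcreate steps and sizes are a reconstruction; only the lvchange, mkfs,
mount, and dd invocations come from the report itself. Run as root on a
scratch volume group; this will deliberately fill the pool.]

```shell
# Create a 1G thin pool and a 2G thin volume on top of it, i.e. the
# volume is overcommitted: 2G virtual size backed by 1G of real space.
lvcreate --type thin-pool -L 1G -n thinpool vg_system
lvcreate --type thin -V 2G -n thinvol --thinpool thinpool vg_system

# Fail writes immediately when the pool runs out, instead of queueing.
lvchange vg_system/thinpool --errorwhenfull=y

mkfs.xfs /dev/vg_system/thinvol
mount /dev/vg_system/thinvol /mnt/storage

# With oflag=sync every write is flushed right away, so exhausting the
# pool surfaces as EIO ("Buffer I/O error") rather than ENOSPC.
dd if=/dev/zero of=/mnt/storage/disk.img bs=1M count=2048 oflag=sync
```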
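[Editor's note: the two escape routes mentioned above, failing the
buffers via 'max_retries' and expanding the pool, correspond roughly to
the following commands. This is a sketch: the sysfs error-configuration
knobs exist in kernels >= 4.7, "dm-6" is the device name from the logs
above, and the lvextend size is an arbitrary example.]

```shell
# Option 1: stop xfsaild retrying failed metadata writeback forever;
# 0 means fail the buffers permanently on the first error.
echo 0 > /sys/fs/xfs/dm-6/error/metadata/EIO/max_retries

# Option 2: give the pool the space XFS already expects to have, so the
# stuck buffers complete on the next retry and the errors go away.
lvextend -L +1G vg_system/thinpool
```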
-- 
Carlos