On Sun, 28 Nov 2010, Lukas Kolbe wrote:
> Hi,
>
> On our backup system (2 LTO4 drives/Tandberg library via LSISAS1068E,
> Kernel 2.6.36 with the stock Fusion MPT SAS Host driver 3.04.17 on
> debian/squeeze), we see reproducible tape read and write failures after
> the system was under memory pressure:
>
> [342567.297152] st0: Can't allocate 2097152 byte tape buffer.
> [342569.316099] st0: Can't allocate 2097152 byte tape buffer.
> [342570.805164] st0: Can't allocate 2097152 byte tape buffer.
> [342571.958331] st0: Can't allocate 2097152 byte tape buffer.
> [342572.704264] st0: Can't allocate 2097152 byte tape buffer.
> [342873.737130] st: from_buffer offset overflow.
>
> Bacula is spewing this message every time it tries to access the tape
> drive:
> 28-Nov 19:58 sd1.techfak JobId 2857: Error: block.c:1002 Read error on fd=10 at file:blk 0:0 on device "drv2" (/dev/nst0). ERR=Input/output error
>
> By memory pressure, I mean that the KVM processes containing the
> postgres-db (~20 million files) and the bacula director have used all
> available RAM; one of them used ~4 GiB of its 12 GiB swap for an hour or
> so (by selecting a full restore, it seems that the whole directory tree
> of the 15-million-file backup gets read into memory). After this, I wasn't
> able to read from the second tape drive anymore (/dev/st0), whereas the
> first tape drive was restoring the data happily (it is currently about
> halfway through a 3 TiB restore from 5 tapes).
>
> This same behaviour appears when we're doing a few incremental backups:
> after a while, it just isn't possible to use the tape drives anymore -
> every I/O operation gives an I/O error, even a simple dd bs=64k
> count=10. After a restart, the system behaves correctly until
> - seemingly - another memory pressure situation has occurred.
>

This is predictable. The maximum number of scatter/gather segments seems
to be 128. The st driver first tries to set up the transfer directly from
the user buffer to the HBA. The user buffer is usually fragmented, so one
scatter/gather segment is used for each page. Assuming a 4 kB page size,
the maximum size of a direct transfer is 128 x 4 = 512 kB.

When this fails, the driver tries to allocate a kernel buffer built from
physically contiguous segments larger than 4 kB. Let's assume it can find
128 segments of 16 kB each; in that case the maximum block size is
2048 kB. Memory pressure leads to memory fragmentation, so the driver
can't find large enough segments and the allocation fails. This is what
you are seeing.

So, one solution is to use a 512 kB block size. Another is to find out
whether the 128-segment limit is a physical limitation or just a choice.
In the latter case the mptsas driver could be modified to support larger
block sizes even after memory fragmentation.

Kai
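
For illustration, here is a minimal C sketch of the block-size arithmetic
described above. The 128-segment limit and the 4 kB / 16 kB segment sizes
come from the explanation; the constant and function names are purely
illustrative, not from the st or mptsas sources.

/*
 * Sketch of the maximum-block-size arithmetic: with a fixed
 * scatter/gather segment limit, the largest usable tape block depends
 * on how big each physically contiguous segment is.
 */
#include <stdio.h>

#define MAX_SG_SEGMENTS 128     /* scatter/gather limit cited above (assumed) */

static unsigned long max_block_size(unsigned long segment_size)
{
        return MAX_SG_SEGMENTS * segment_size;
}

int main(void)
{
        /* Direct I/O from a fragmented user buffer: one 4 kB page per segment. */
        printf("direct (4 kB pages):    %lu kB\n", max_block_size(4096) / 1024);

        /* Kernel buffer built from 16 kB physically contiguous segments. */
        printf("buffered (16 kB segs):  %lu kB\n", max_block_size(16384) / 1024);

        return 0;
}

This prints 512 kB and 2048 kB, matching the two cases above. If you go
with the 512 kB suggestion, that corresponds to a tape block size of
524288 bytes (in Bacula presumably the "Maximum Block Size" directive in
the device resource, but check your configuration).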