On Thu, 2015-11-19 at 16:35 +0100, Hannes Reinecke wrote:
> On 11/19/2015 09:23 AM, Christoph Hellwig wrote:
> > It's pretty much guaranteed a block layer bug, most likely in the
> > merge bios to request infrastructure where we don't obey the merging
> > limits properly.
> >
> > Does either of you have a known good and first known bad kernel?
>
> Well, I have been fighting a similar issue for several months now,
> albeit with multipath enabled. Haven't had much progress with this,
> sadly.
> Seeing that this is our distro kernel it might or might not be
> related; however, as the symptoms are identical there still is a
> chance that this is actually a generic block-layer problem.
>
> Cheers,
>
> Hannes

We have seen this also.  (For example, req->nr_phys_segments was 3, but
blk_rq_map_sg() returned 4.)

I was suspicious of the patch:

  bio: modify __bio_add_page() to accept pages that don't start a new segment

But we put some debugging code in and didn't hit it.

We haven't found the problem yet either.  We're still looking.
As Christoph said, it would seem to be a problem with the block layer
merging.

The API for this seems defective: blk_rq_map_sg() should never write
past the end of the supplied SG array, return a value indicating that
it did so, and depend on the caller to check it.  (We could get data
corruption on another I/O if it used adjacent memory for a different
SG list, for example.)

-Ewan
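
To illustrate the kind of interface argued for above, here is a rough
user-space sketch of a mapping routine that is told the capacity of the
output array and fails instead of writing past it.  This is not kernel
code: struct seg and map_sg_bounded() are hypothetical names made up for
the example, and the merge rule (coalesce only when a chunk starts exactly
where the previous segment ends) is a simplifying assumption, not the
block layer's actual merging logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SG entry; NOT the kernel's struct scatterlist. */
struct seg {
	size_t addr;
	size_t len;
};

/*
 * Coalesce address-contiguous input chunks into at most `cap` output
 * segments.  Returns the number of segments written, or -1 if more than
 * `cap` segments would be needed.  Never writes past out[cap - 1].
 */
static int map_sg_bounded(const struct seg *in, size_t n_in,
			  struct seg *out, size_t cap)
{
	size_t n_out = 0;

	assert(out != NULL || cap == 0);

	for (size_t i = 0; i < n_in; i++) {
		/* Merge into the previous segment if this chunk is adjacent. */
		if (n_out > 0 &&
		    out[n_out - 1].addr + out[n_out - 1].len == in[i].addr) {
			out[n_out - 1].len += in[i].len;
			continue;
		}
		/* A new segment is needed; refuse rather than overrun. */
		if (n_out == cap)
			return -1;
		out[n_out++] = in[i];
	}
	return (int)n_out;
}
```

With an interface like this, a mismatch between the precomputed segment
count and what the mapping actually needs shows up as an error return,
rather than as a write into whatever memory follows the SG array.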