On Thu, Jan 21, 2016 at 7:21 PM, Keith Busch <keith.busch@xxxxxxxxx> wrote: > On Thu, Jan 21, 2016 at 05:12:13PM -0800, Linus Torvalds wrote: >> >> I assume that in this case it's simply that >> >> - max_sectors is some odd number in sectors (ie 65535) >> >> - the block size is larger than a sector (ie 4k) > > Wouldn't that make max sectors non-sensical? Or am I mistaken to think max > sectors is supposed to be a valid transfer in multiples of the physical > sector size? If the controller interface is some 16-bit register, then the maximum number of sectors you can specify is 65535. But if the disk then doesn't like 512-byte accesses, but wants 4kB or whatever, then clearly you can't actually *feed* it that maximum number. Not because it's a maximal, but because it's not aligned. But that doesn't mean that it's non-sensical. It just means that you have to take both things into account. There may be two totally independent things that cause the two (very different) rules on what the IO can look like. Obviously there are probably games we could play, like always limiting the maximum sector number to a multiple of the sector size. That would presumably work for Stefan's case, by simply "artificially" making max_sectors be 65528 instead. But I do think it's better to consider them independent issues, and just make sure that we always honor those things independently. That "honor things independently" used to happen automatically before, simply because we'd never split in the middle of a bio segment. And since each bio segment was created with the limitations of the device in mind, that all worked. Now that it splits in the middle of a vector entry, that splitting just needs to honor _all_ the rules. Not just the max sector one. >> What I think it _should_ do is: >> >> (a) check against max sectors like it used to do: >> >> if (sectors + (bv.bv_len >> 9) > queue_max_sectors(q)) >> goto split; > > This can create less optimal splits for h/w that advertise chunk size. I > know it's a quirky feature (wasn't my idea), but the h/w is very slow > to not split at the necessary alignments, and we used to handle this > split correctly. I suspect few high-performance controllers will really have big issues with the max_sectors thing. If you have big enough IO that you could hit the maximum sector number, you're already pretty well off, you might as well split at that point. So I think it's ok to split at the max sector case early. For the case of nvme, for example, I think the max sector number is so high that you'll never hit that anyway, and you'll only ever hit the chunk limit. No? So in practice it won't matter, I suspect. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-s390" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html