Re: [PATCH v2 3/3] libata: use blk taging

On 01/15/2015 12:51 PM, Dan Williams wrote:
> On Thu, Jan 15, 2015 at 11:15 AM, Jens Axboe <axboe@xxxxxx> wrote:
>> On 01/15/2015 11:59 AM, Dan Williams wrote:
>>> I still don't understand what we get by adding this new allocator
>>> besides complexity, am I missing something?
>>
>> Two things:
>>
>> - libata tag allocator sucks. Like seriously sucks, it's almost a worst case
>> implementation.
>
> Not questioning its suckiness, but I thought the SATA suckiness made
> it moot.  Apparently not in all cases...

The laptop I'm typing this from does 145K 4k random read IOPS, so it's definitely in the range where it matters.

>> - Much better to have a single unified allocator to tweak and tune, than
>> having separate versions.
>>
>> #2 is still lacking a bit, but I don't think it'd be impossible to unify it
>> all.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=87101 has gone silent, I
> need to ping it.  That's my primary concern with the current proposal,
> supporting controllers that have weird/unnatural relationships with
> the value of the tag.

Unfortunately parts of SATA are as crappy as USB when it comes to things like that. I can understand why some controllers would like to see a natural ordering of the tags (even if it is stupid to require, but AHCI doesn't help there), but it makes very little sense why it would break others. It looks like this particular case was likely a different bug, and the ordering just made it show up more easily.

And speaking of strict ordering, the blk-mq tagging should actually improve ordering. The libata implementation orders globally, but that'll equally break down when multiple processes access the device. In that case you end up interleaving, and if the drive does strict by-tag ordering of what IO to do, it'll go random pretty quickly. The blk-mq implementation preserves ordering between threads in that case, due to how the last tag is cached. So I would expect to see an improvement in behavior with that for use cases that offload IO to thread pools (like POSIX AIO, or private implementations in programs).
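
To make the interleaving argument concrete, here is a toy model in Python. It is not the libata or blk-mq code: the class names, the evenly spread starting hints, and the "drive services tags in ascending order" rule are all assumptions made for illustration. It only shows why a single global round-robin cursor mixes the streams of two submitters, while a per-submitter cached last tag tends to keep each stream's tags adjacent.

```python
from typing import List, Optional

DEPTH = 32  # SATA NCQ exposes at most 32 tags


def find_free(busy: List[bool], start: int) -> Optional[int]:
    """Return the first free tag at or after `start`, wrapping around."""
    for i in range(len(busy)):
        tag = (start + i) % len(busy)
        if not busy[tag]:
            return tag
    return None


class GlobalAllocator:
    """Toy model: one shared cursor, every caller continues from it."""

    def __init__(self, depth: int = DEPTH) -> None:
        self.busy = [False] * depth
        self.next_tag = 0

    def alloc(self, thread: int) -> Optional[int]:
        tag = find_free(self.busy, self.next_tag)
        if tag is not None:
            self.busy[tag] = True
            self.next_tag = (tag + 1) % len(self.busy)
        return tag


class PerThreadHintAllocator:
    """Toy model: each submitter caches its own last tag as a search hint.

    Assumption: starting hints are spread evenly over the tag space.
    """

    def __init__(self, nthreads: int, depth: int = DEPTH) -> None:
        self.busy = [False] * depth
        self.hint = {t: t * (depth // nthreads) for t in range(nthreads)}

    def alloc(self, thread: int) -> Optional[int]:
        tag = find_free(self.busy, self.hint[thread])
        if tag is not None:
            self.busy[tag] = True
            self.hint[thread] = (tag + 1) % len(self.busy)
        return tag


def stream_switches(allocator, nthreads: int = 2, per_thread: int = 4) -> int:
    """Submit IOs round-robin across threads, then count how often a
    drive that services strictly by ascending tag switches streams."""
    issued = []  # (tag, thread) pairs, in submission order
    for _ in range(per_thread):
        for t in range(nthreads):
            issued.append((allocator.alloc(t), t))
    order = [t for _, t in sorted(issued)]
    return sum(1 for a, b in zip(order, order[1:]) if a != b)


if __name__ == "__main__":
    print("global cursor:", stream_switches(GlobalAllocator()))
    print("per-thread hint:", stream_switches(PerThreadHintAllocator(2)))
```

In this model the global cursor hands out alternating tags to the two submitters, so a by-tag drive bounces between streams on every command, whereas the cached hint keeps each submitter's tags contiguous. Real hardware and the real allocators are more involved (completions free tags out of order, hints migrate), so treat the numbers as an illustration only.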

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


