Jens Axboe wrote:
> On Mon, Nov 10 2008, Tejun Heo wrote:
>> Hello, all.
>>
>> I went through libata-convert-to-block-tagging today and found several
>> issues.
>>
>> 1. libata internal data structure for command context (qc) allocation is
>> bound to tag allocation, which means that block layer tagging should be
>> enabled for all controllers which have can_queue > 1.
>
> Naturally, is there a problem there?

Queueing wasn't enabled for ATAPI devices behind a PMP, so those devices
ended up reusing already allocated qc's.  Not difficult to fix.

>> 2. blk-tag offsets allocation for non-sync requests.  I'm not confident
>> this is safe.  Till now, if there was only single command in flight for
>> the port, it was guaranteed that the qc gets tag zero whether the device
>> is NCQ capable or not.  qc allocation is tied tightly with hardware
>> command slot allocation and I don't think it's wise to change this
>> assumption.
>>
>> #1 is easy to fix but #2 requires either adding a spinlock or two atomic
>> variables to struct blk_queue_tag to keep the current behavior while
>> guaranteeing that tags are used in order.  Also, there's delay between
>> libata marks a request complete and the request actually gets completed
>> and the tag is freed.  If another request gets issued in between, the tag
>> number can't be guaranteed.  This can be worked around by re-mapping tag
>> number in libata depending on command type but, well then, it's worse
>> than the original implementation.
>
> Or we could just change the blk-tag.c logic to stop if
> find_first_zero_bit() returns >= some_value instead of starting at an
> offset? You don't need any extra locking for that.

I tried that but there's a behavior difference.  If you reserve from the
beginning, sync IOs prefer the reserved slots.  If you reserve from the
end, sync IOs prefer the non-reserved slots.  I.e. when 4 slots are
reserved for sync IO and 4 sync IOs are already in flight, the fifth
sync IO competes with async IOs on the same ground in the former case,
but in the latter it either wins or is very likely to take another
reserved slot.

> The second part is more tricky I think, but I'm not sure there's a race
> there. For normally issued IO, queueing is restarted when the tag
> completes. There's a small softirq delay there, but that delay is before
> the tag is completed and queueing restarted. Any non-ncq command (eg
> through an ioctl) will have to wait for completion as well.

For drivers with .can_queue == 1, nothing can go wrong.  I was worried
about the NCQ -> non-NCQ transition: blk-tag doesn't know that a non-NCQ
command must not be scheduled together with NCQ commands, so it will
happily assign any tag, and in this case the race window definitely is
there.

>> So, please revert the following commits.
>>
>> 43a49cbdf31e812c0d8f553d433b09b421f5d52c
>> e013e13bf605b9e6b702adffbe2853cfc60e7806
>> 2fca5ccf97d2c28bcfce44f5b07d85e74e3cd18e
>
> It's not a big deal to me, it can wait for 2.6.29 if there really are
> more issues there. But I'm not sure that your points are very valid, so
> let's please discuss it a bit more :-)

Yeap, I fully agree that moving to blk tagging is good.  If we fix the
problem from #1, it's probably gonna be okay for most cases too.  I'm
just a little bit nervous because libata has always assumed tag 0 for
non-NCQ commands and this conversion changes that, so I was hoping to
first update blk-tag such that the assumption can be guaranteed and
then convert libata, to be on the safe side.  Some controllers use
completely different command mechanisms for different protocols, and
it's much safer and more deterministic if the same tag can be
guaranteed.

Thanks.
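
P.S. To make the difference between the two reservation schemes concrete,
here is a rough userspace sketch.  It is not blk-tag.c code: DEPTH,
RESERVED and all the function names are made up for illustration, and a
plain unsigned int stands in for the tag bitmap.

```c
#include <assert.h>

#define DEPTH    8  /* hypothetical queue depth */
#define RESERVED 4  /* slots set aside for sync IO */

/* First clear bit of map in [start, end), or -1 if the range is full. */
static int first_zero(unsigned int map, int start, int end)
{
	assert(start >= 0 && start <= end && end <= DEPTH);
	for (int i = start; i < end; i++)
		if (!(map & (1u << i)))
			return i;
	return -1;
}

/* Mark tag as in use (if one was found) and hand it back. */
static int take(unsigned int *map, int tag)
{
	if (tag >= 0)
		*map |= 1u << tag;
	return tag;
}

/* Reserve [0, RESERVED): async searches start at an offset. */
static int alloc_reserve_low(unsigned int *map, int sync)
{
	return take(map, first_zero(*map, sync ? 0 : RESERVED, DEPTH));
}

/* Reserve [DEPTH - RESERVED, DEPTH): async searches stop at a limit. */
static int alloc_reserve_high(unsigned int *map, int sync)
{
	return take(map, first_zero(*map, 0, sync ? DEPTH : DEPTH - RESERVED));
}
```

On an idle queue (map == 0), alloc_reserve_low(&map, 0) hands an async
request tag 4, while alloc_reserve_high(&map, 0) hands it tag 0.  Sync
requests fill slots 0-3 first under the first scheme, and under the
second take a reserved slot only once all the lower ones are busy.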
--
tejun