On 2/25/23 02:51, Keith Busch wrote:
> On Fri, Feb 24, 2023 at 11:54:39PM +0000, Chaitanya Kulkarni wrote:
>> I do think that we should work on CDL for NVMe as it will solve some of
>> the timeout-related problems more effectively than using aborts or any
>> other mechanism.
>
> That proposal exists in NVMe TWG, but doesn't appear to have recent activity.
> The last I heard, one point of contention was where the duration limit property
> exists: within the command, or the queue. From my perspective, if it's not at
> the queue level, the limit becomes meaningless, but hey, it's not up to me.
And that is one of the issues I'd like to discuss.
As it stands, CDLs are defined for the controller only; queuing effects
from the transport are out of scope (for the current CDL definition).
So for NVMe-oF we would need to discuss how we can specify CDLs for
fabrics; in particular, the relationship between CDLs and transport
timeouts is ... interesting, and we need to discuss how to correlate the two.
Having it on the queue as you suggested would be cool as it would give a
nice overall number, but discussions with the driver vendors were not
encouraging; they're having a hard time giving timeout guarantees in
really quirky failure cases.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect