Re: [LSF/MM/BPF BOF] Userspace command aborts

Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx> · Sat, 25 Feb 2023 13:15:49 +0900

On 2/25/23 10:51, Keith Busch wrote:
> On Fri, Feb 24, 2023 at 11:54:39PM +0000, Chaitanya Kulkarni wrote:
>> I do think that we should work on CDL for NVMe as it will solve some of
>> the timeout-related problems more effectively than using aborts or any other
>> mechanism.
> 
> That proposal exists in NVMe TWG, but doesn't appear to have recent activity.
> The last I heard, one point of contention was where the duration limit property
> exists: within the command, or the queue. From my perspective, if it's not at
> the queue level, the limit becomes meaningless, but hey, it's not up to me.

A limit attached to the command makes things more flexible and easier for the
host, so personally, I prefer that. But this has an impact on the controller:
the device needs to pull in *all* commands to be able to know their limits and
do scheduling/aborts appropriately. Device designers do not like that, for
obvious reasons (device internal resources...).

On the other hand, limits attached to queues could lead to either a serious
increase in the number of queues (bounded by PCI space and the number of IRQ
vectors) or a loss of performance, as a particular queue with the desired limit
would be accessed from multiple CPUs on the host (lock contention). A tricky
problem, I think, with lots of compromises.
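
To put rough numbers on the per-queue trade-off, here is a back-of-the-envelope
sketch in C. It is not taken from the NVMe proposal or from any driver:
struct host_cfg, queues_wanted() and cpus_per_queue() are hypothetical names,
and the model assumes the ideal layout is one private queue per CPU per
duration limit, with the available queues split evenly between limits.

```c
#include <assert.h>

/* Hypothetical host model; none of these fields come from a spec or driver. */
struct host_cfg {
	unsigned int nr_cpus;       /* CPUs submitting I/O */
	unsigned int nr_limits;     /* distinct duration limits in use */
	unsigned int max_io_queues; /* cap from the controller / IRQ vectors */
};

/* Queues needed if every CPU gets a private queue for every limit. */
static unsigned int queues_wanted(const struct host_cfg *c)
{
	assert(c->nr_cpus > 0 && c->nr_limits > 0);
	return c->nr_cpus * c->nr_limits;
}

/*
 * CPUs that end up sharing one queue once the queue count is capped.
 * A result of 1 means no sharing, anything larger means contention.
 */
static unsigned int cpus_per_queue(const struct host_cfg *c)
{
	unsigned int wanted = queues_wanted(c);
	unsigned int avail = c->max_io_queues < wanted ? c->max_io_queues : wanted;
	unsigned int per_limit;

	/* Assumption: at least one queue can be dedicated to each limit. */
	assert(avail >= c->nr_limits);
	per_limit = avail / c->nr_limits;

	/* Round up: leftover CPUs still have to land on some queue. */
	return (c->nr_cpus + per_limit - 1) / per_limit;
}
```

Under these assumptions, 64 CPUs with 4 limits want 256 queues; if only 128
are available, each queue ends up shared by 2 CPUs.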

-- 
Damien Le Moal
Western Digital Research