Re: [PATCH blktests v2 0/3] Test different queue counts

Chaitanya Kulkarni <chaitanyak@xxxxxxxxxx> · Tue, 28 Mar 2023 18:20:57 +0000

On 3/28/23 01:45, Shinichiro Kawasaki wrote:
> On Mar 27, 2023 / 17:41, Daniel Wagner wrote:
>> On Thu, Mar 23, 2023 at 11:06:53AM +0000, Shinichiro Kawasaki wrote:
>>> On Mar 22, 2023 / 11:16, Daniel Wagner wrote:
>>>> Setup different queues, e.g. read and poll queues.
>>>>
>>>> There is still the problem that _require_nvme_trtype_is_fabrics also includes
>>>> the loop transport which has no support for different queue types.
>>>>
>>>> See also https://lore.kernel.org/linux-nvme/20230322002350.4038048-1-kbusch@xxxxxxxx/
>>> Hi Daniel, thanks for the patches. The new test case catches some bugs. Looks
>>> valuable.
>>>
>>> I ran the test case using various nvme_trtype on kernel v6.2 and v6.3-rc3, and
>>> observed hangs. I applied the 3rd patch in the link above on top of v6.3-rc3 and
>>> confirmed the hang disappears. I would like to wait for the kernel fix patch
>>> delivered to upstream, before adding this test case to blktests master.
>> Okay makes sense.
>>
>>> When I ran the test case without setting nvme_trtype, kernel reported messages
>>> below:
>>>
>>> [  199.621431][ T1001] nvme_fabrics: invalid parameter 'nr_write_queues=%d'
>>> [  201.271200][ T1030] nvme_fabrics: invalid parameter 'nr_write_queues=%d'
>>> [  201.272155][ T1030] nvme_fabrics: invalid parameter 'nr_poll_queues=%d'
>> BTW, I've added a '|| echo FAIL' to catch those.
>>
>>> Is it useful to run the test case with default nvme_trtype=loop?
>> No, we should run this test only for those transport which actually support the
>> different queue types. Christoph suggest to figure out before running the test
>> if it is actually supported. So my first idea was to check what options are
>> supported by reading /dev/nvme-fabrics. But this will return all options we are
>> parsed by fabrics.c but not the subset which each transport might only support.
>>
>> So to figure this out we would need to do a full setup just to figure out if it
>> is supported. I think the currently best approach would just to limit this test
>> to tcp and rdma. Maybe we could add something like
>>
>> rc:
>> _require_nvme_trtype() {
>> 	local trtype
>> 	for trtype in "$@"; do
>> 		if [[ "${nvme_trtype}" == "$trtype" ]]; then
>> 			return 0
>> 		fi
>> 	done
>> 	SKIP_REASONS+=("nvme_trtype=${nvme_trtype} is not supported in this test")
>> 	return 1
>> }
>>
>> 047:
>> requires() {
>> 	_nvme_requires
>> 	_have_xfs
>> 	_have_fio
>> 	_require_nvme_trtype tcp rdma
>> 	_have_kver 4 21
>> }
>>
>> What do you think?
> Thanks for the clarifications about the requirements. I think your approach will
> work. Having said that, we may have another potentially simpler solution:
>
> - Do not check _require_nvme_trtype and _have_kver in requires().
> - After setting nr_write_queues in test(), check if dmesg contains the error
>    "invalid parameter 'nr_write_queues" using the helper function
>    _dmesg_since_test_start().
> - If the error is reported, set SKIP_REASONS and return from test().
>    Blktests will report the test case as "not run".
>
> This approach assumes that the "invalid parameter" is printed when the test case
> should be skipped (loop transport, older kernel version).

Is it possible to not rely on dmesg unless it is absolutely required ?

> As a generic guide, SKIP_REASONS should be set in requires() before test().
> However, if the SKIP_REASONS can not be checked before test(), blktests allows
> to set it in test(). The test case block/030 is such an exception. I think your
> new test case can be another exception. With this, we do not need to repeat the
> full setup. And it might be more robust against future changes such as new
> transport types.

Ummm should we avoid creating exceptions ? unless it is absolutely 
necessary ?
The problem with exception is it becomes problematic for long term 
maintenance.

I believe currently focusing on tcp/rdma only is sufficient ...

-ck