Re: [PATCH blktests v3 4/4] Add a test that triggers the blk_mq_realloc_hw_ctxs() error path

Bart Van Assche <bvanassche@xxxxxxx> · Tue, 24 Mar 2020 19:10:43 -0700

On 2020-03-24 03:41, Daniel Wagner wrote:
> Hi Bart,
> 
> On Mon, Mar 23, 2020 at 08:09:27PM -0700, Bart Van Assche wrote:
>> On 2020-03-23 04:29, Daniel Wagner wrote:
>>> On Fri, Mar 20, 2020 at 03:24:13PM -0700, Bart Van Assche wrote:
>>>> +test() {
>>>> +	local i sq=/sys/kernel/config/nullb/nullb0/submit_queues
>>>> +
>>>> +	: "${TIMEOUT:=30}"
>>>> +	if ! _init_null_blk nr_devices=0 queue_mode=2 "init_hctx=$(nproc),100,0,0"; then
> 
> From the kernel code:
> 
> 	/* "<interval>,<probability>,<space>,<times>" */
> 
> Don't you need to set the times attribute to -1 in order to inject the
> failure every time the interval is reached? If I understood it correctly,
> with times=0 no failure is injected.
> 
> BTW, I've had to change it to init_hctx=$(($(nproc)+1)) to pass
> the initial __configure_null_blk call before the first fail hits.

Hi Daniel,

I will make both changes in the init_hctx string. Not sure how this
escaped my attention.
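
A minimal sketch of how the adjusted string could be assembled, assuming
the "<interval>,<probability>,<space>,<times>" order quoted above. The
variable names are illustrative only and not taken from the patch:

```shell
# Attribute order: <interval>,<probability>,<space>,<times>
interval=$(($(nproc) + 1))  # let the initial configuration pass first
probability=100             # always fail once the interval is reached
space=0
times=-1                    # -1 means no limit on the number of failures
init_hctx="${interval},${probability},${space},${times}"
echo "init_hctx=${init_hctx}"
```

The resulting "init_hctx=..." string would then replace the one passed
to _init_null_blk in the quoted test.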

>>> Doesn't the $(nproc) make the test subtly dependent on the execution
>>> environment?
>> The value $(nproc) has been chosen on purpose. The following code from
>> the test script:
>>
>> +			echo 1 >$sq
>> +			nproc >$sq
>>
>> triggers (nproc + 1) calls to null_init_hctx(). So injecting a failure
>> after (nproc) null_init_hctx() calls triggers the following pattern:
>> * The first blk_mq_realloc_hw_ctxs() call fails after (nproc - 1)
>> null_init_hctx() calls.
>> * The second blk_mq_realloc_hw_ctxs() call fails after (nproc - 2)
>> null_init_hctx() calls.
>> ...
>> * The (nproc) th blk_mq_realloc_hw_ctxs() call fails after one
>> null_init_hctx() call.
>> * The (nproc + 1) th blk_mq_realloc_hw_ctxs() call succeeds.
>>
>> I'm not sure how to trigger this behavior without using the $(nproc) value.
> 
> Okay, I get the idea how you want to test.
> 
> Is the dependency on nproc because null_blk expects submit_queue <= online
> cpus?

That's correct. I want to test with the maximum number of submit queues
allowed, hence the use of $(nproc).

> Though why the 100?
> 
> 		for ((i=0;i<100;i++)); do
> 			echo 1 >$sq
> 			nproc >$sq
> 		done

No particular reason other than "a significant number of iterations".

> And shouldn't there be a test for errors?

All I want to test is the absence of kernel crashes. The blktests
framework already inspects dmesg for the absence of kernel crashes. So I
don't think that I have to check whether or not the quoted sysfs writes
succeed.
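
As a rough sketch of that approach (not the code from the patch), the
loop can ignore the result of every write and leave crash detection to
the dmesg check. The helper name toggle_submit_queues and its arguments
are hypothetical:

```shell
# Hypothetical helper: repeatedly switch the submit queue count between 1
# and $(nproc) without checking whether the individual writes succeed.
# The function body runs in a subshell so that its variables stay private.
toggle_submit_queues() (
	sq=$1
	iterations=${2:-100}
	i=0
	while [ "$i" -lt "$iterations" ]; do
		# A write may fail while failures are being injected; that is
		# expected here, so both the message and the status are dropped.
		{ echo 1 >"$sq"; } 2>/dev/null || true
		{ nproc >"$sq"; } 2>/dev/null || true
		i=$((i + 1))
	done
)
```

With the configfs path from the quoted test this would be called as
"toggle_submit_queues /sys/kernel/config/nullb/nullb0/submit_queues".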

Thanks,

Bart.