Re: [PATCH blktests] tests/rnbd: Implement RNBD regression test

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Shinichiro,


On 24/12/2024 10:47, Shinichiro Kawasaki wrote:
> On Dec 23, 2024 / 13:45, Li Zhijian wrote:
>> This test case has been in my possession for quite some time.
>> I am upstreaming it now because it has once again detected a regression in
>> a recent kernel release [1].
>>
>> It's just stupid to connect and disconnect RNBD on localhost and expect
>> no dmesg exceptions, with some attempts actually succeeding.
>>
>> Please note that currently, only RTRS over RXE is supported.
>>
>> [1] https://lore.kernel.org/linux-rdma/20241223025700.292536-1-lizhijian@xxxxxxxxxxx/
> 
> Hello Li, thank you for the patch. I ran the test case that this patch adds with
> the kernel v6.13-rc4 KASAN enabled. I observed "BUG: KASAN: slab-use-after-free
> in __list_add_valid_or_report". I confirmed that your kernel patch avoids
> the message. So, the new blktests test case catches a kernel BUG. Good.

Thanks for the testing

> 
> Having said that, even with the kernel fix patch, still I observe the test
> case failure in my QEMU test environment. The count j is almost always zero.
> I once observed the count became 3, but it is far smaller than the criteria 10.
> I guess test environment performance factors might affect the pass/fail results.
> Do we really need to check the start/stop success ratio? 

Good question
I often observed ~50% success in my QEMU environment, zero success is not expected IMO.


I think the BUG can be
> caught regardless of the check.
> 

Yes, this is true.


> Also, please find my other reivew comments in line.
> 
>>
>> Signed-off-by: Li Zhijian <lizhijian@xxxxxxxxxxx>
>> ---
>> Copy to the RDMA/rtrs guys:
>>
>> Cc: Jack Wang <jinpu.wang@xxxxxxxxx>
>> Cc: Jason Gunthorpe <jgg@xxxxxxxx>
>> Cc: Leon Romanovsky <leon@xxxxxxxxxx>
>> Cc: "Md. Haris Iqbal" <haris.iqbal@xxxxxxxxx>
>> ---
>>   tests/rnbd/001     | 37 ++++++++++++++++++++++++++++
>>   tests/rnbd/001.out |  2 ++
>>   tests/rnbd/rc      | 60 ++++++++++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 99 insertions(+)
>>   create mode 100755 tests/rnbd/001
>>   create mode 100644 tests/rnbd/001.out
>>   create mode 100644 tests/rnbd/rc
>>
>> diff --git a/tests/rnbd/001 b/tests/rnbd/001
>> new file mode 100755
>> index 000000000000..220468f0f5b4
>> --- /dev/null
>> +++ b/tests/rnbd/001
>> @@ -0,0 +1,37 @@
>> +#!/bin/bash
>> +# SPDX-License-Identifier: GPL-3.0+
>> +# Copyright 2024, Fujitsu
>> +#
> 
> I suggest to describe shortly here that this test can catch the kernel
> regression, in same manner as the commit message.

Ok, I will update it.


> 
>> +. tests/rnbd/rc
>> +
>> +DESCRIPTION="Start Stop RNBD"
>> +CHECK_DMESG=1
> 
> Do you expect this test completes quickly (around 30 seconds)? If so,
> I suggest to add QUICK=1 here.

Yes, QUICK=1 is fine.(It finishes in 15 seconds)


> 
>> +
>> +requires() {
>> +	_have_rnbd
> 
> I suggest to add "_have_loop" here.

Ok.


> 
>> +}
>> +
>> +test_start_stop()
>> +{
>> +	_setup_rnbd || return
>> +
>> +	local loop_dev i j
>> +	loop_dev="$(losetup -f)"
>> +
>> +	for ((i=0;i<100;i++))
>> +	do
>> +		_start_rnbd_client "${loop_dev}" &>/dev/null &&
>> +		_stop_rnbd_client &>/dev/null && ((j++))>> +	done
>> +
>> +	# We expect at least 10% start/stop successfully

I will consider the ratio again.



>> +	if [[ $j -lt 10 ]]; then
>> +		echo "Failed: $j/$i"
>> +	fi
>> +}
>> +
>> +test() {
>> +	echo "Running ${TEST_NAME}"
>> +	test_start_stop
>> +	echo "Test complete"
>> +}
>> diff --git a/tests/rnbd/001.out b/tests/rnbd/001.out
>> new file mode 100644
>> index 000000000000..c1f9980d0f7b
>> --- /dev/null
>> +++ b/tests/rnbd/001.out
>> @@ -0,0 +1,2 @@
>> +Running rnbd/001
>> +Test complete
>> diff --git a/tests/rnbd/rc b/tests/rnbd/rc
>> new file mode 100644
>> index 000000000000..143ba0b59f38
>> --- /dev/null
>> +++ b/tests/rnbd/rc
>> @@ -0,0 +1,60 @@
>> +#!/bin/bash
>> +# SPDX-License-Identifier: GPL-3.0+
>> +# Copyright 2024, Fujitsu
>> +#
>> +# RNBD tests.
>> +
>> +. common/rc
>> +. common/multipath-over-rdma
>> +
>> +_have_rnbd() {
>> +	if [[ "$USE_RXE" != 1 ]]; then
>> +		SKIP_REASONS+=("Only USE_RXE=1 is supported")
>> +		return 1
>> +	fi
>> +	_have_driver rdma_rxe || return
>> +	_have_driver rnbd_server || return
>> +	_have_driver rnbd_client
>> +}
>> +
>> +_setup_rnbd() {
>> +	modprobe -q rnbd_server
>> +	modprobe -q rnbd_client
>> +	start_soft_rdma
> 
> I was not able to find stop_soft_rdma() in this patch. It might be the better to
> add _cleanup_rnbd() to call stop_soft_rdma().

Good catch, it sounds good to me.


> 
>> +	for i in $(rdma_network_interfaces)
>> +	do
>> +		ipv4_addr=$(get_ipv4_addr "$i")
>> +		if [[ -n "${ipv4_addr}" ]]; then
>> +			def_traddr=${ipv4_addr}
>> +		fi
>> +	done
>> +}
>> +
>> +_start_rnbd_client() {
>> +	local a b
>> +	local blkdev=$1
>> +	local before after
>> +
>> +	before=$(ls -d /sys/block/rnbd* 2>/dev/null)
>> +	if ! echo "sessname=blktest path=ip:$def_traddr device_path=$blkdev" > /sys/devices/virtual/rnbd-client/ctl/map_device; then
>> +		return 1
>> +	fi
>> +
>> +	# Retrieve the newly added rnbd entry
>> +	after=$(ls -d /sys/block/rnbd* 2>/dev/null)
>> +	for a in $after
>> +	do
>> +		[[ -n "$before" ]] || break
>> +
>> +		for b in $before
>> +		do
>> +			[[ "$a" = "$b" ]] || break
>> +		done
>> +	done
>> +
>> +	rnbd_node=$a
> 
> Nit: this rnbd_node is a global variable. To clarify it, I suggest to use
> capital letters for its name and declare it at the beginning of this script
> file, like,
> 
> declare RNBD_NODE

Sounds good.


Thanks
Zhijian

> 
>> +}
>> +
>> +_stop_rnbd_client() {
>> +	echo "normal" > "$rnbd_node"/rnbd/unmap_device
>> +}
>> -- 
>> 2.47.0




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux