On Jun 13, 2024 / 15:30, Ofir Gal wrote: > A bug in md-bitmap has been discovered by setting up a md raid on top of > nvme-tcp devices that has optimal io size larger than the allocated > bitmap. > > The following test reproduce the bug by setting up a md-raid on top of > nvme-tcp device over ram device that sets the optimal io size by using > dm-stripe. > > Signed-off-by: Ofir Gal <ofir.gal@xxxxxxxxxxx> > --- > This is my first contribution to blktests. The tests hangs when being > ran on a kernel without the bugfix. Should we wait for the bugfix to be > merged to upstream before merging the test? Thank you for the contribution :) If the bug does not cause any hang or failure of other following test case runs, I think it's ok to add this test case to the blktests before the fix gets merged to the upstream. Please find my in-line comments below. Later on, I will do trial runs. > > common/brd | 28 +++++++++++++++++ > tests/md/001 | 80 ++++++++++++++++++++++++++++++++++++++++++++++++ > tests/md/001.out | 2 ++ > tests/md/rc | 12 ++++++++ > 4 files changed, 122 insertions(+) > create mode 100644 common/brd > create mode 100755 tests/md/001 > create mode 100644 tests/md/001.out > create mode 100644 tests/md/rc > > diff --git a/common/brd b/common/brd > new file mode 100644 > index 0000000..b8cd4e3 > --- /dev/null > +++ b/common/brd > @@ -0,0 +1,28 @@ > +#!/bin/bash > +# SPDX-License-Identifier: GPL-3.0+ > +# Copyright (C) 2024 Ofir Gal > +# > +# brd helper functions > + > +. common/shellcheck > + > +_have_brd() { > + _have_driver brd This _have_driver() call checks brd driver is available, but it does not check brd driver is loadable. I think _init_brd() and _cleanup_brd() assume that brd driver is loadable. For that check, please do "_have_module brd" instead. One left work is to improve this case work with built-in brd driver, because some blktests users prefer running tests with built-in drivers. At this point, I think it's okay to go with loadable brd driver. > +} > + > +_init_brd() { > + # _have_brd loads brd, we need to wait a bit for brd to be not in use in > + # order to reload it > + sleep 0.2 > + > + if ! modprobe -r brd || ! modprobe brd "$@" ; then > + echo "failed to reload brd with args: $*" > + return 1 > + fi > + > + return 0 > +} > + > +_cleanup_brd() { > + modprobe -r brd > +} > diff --git a/tests/md/001 b/tests/md/001 > new file mode 100755 > index 0000000..d5fb755 > --- /dev/null > +++ b/tests/md/001 > @@ -0,0 +1,80 @@ > +#!/bin/bash > +# SPDX-License-Identifier: GPL-3.0+ > +# Copyright (C) 2024 Ofir Gal > +# > +# Regression test for patch "md/md-bitmap: fix writing non bitmap pages" and > +# for patch "nvme-tcp: use sendpages_ok() instead of sendpage_ok()" > + > +. tests/md/rc > +. tests/nvme/rc I want to avoid cross references acoss test groups. So far, all test groups do not have any cross reference to keep them independent. How about to add this test case to the nvme test group? > +. common/brd > + > +DESCRIPTION="Create a raid with bitmap on top of nvme device with > +optimal-io-size over bitmap size" This descrption is printed as blktests runs. All other blktests have single line description then the two lines description looks strange. Can we make it shorter to fit in one line? > +QUICK=1 > + > +#restrict test to nvme-tcp only > +nvme_trtype=tcp > +nvmet_blkdev_type="device" > + > +requires() { > + # Require dm-stripe > + _have_program dmsetup > + _have_driver dm-mod > + > + _require_nvme_trtype tcp > + _have_brd > +} > + > +# Sets up a brd device of 1G with optimal-io-size of 256K > +setup_underlying_device() { > + if ! _init_brd rd_size=1048576 rd_nr=1; then > + return 1 > + fi > + > + dmsetup create ram0_big_optio --table \ > + "0 $(blockdev --getsz /dev/ram0) striped 1 512 /dev/ram0 0" > +} > + > +cleanup_underlying_device() { > + dmsetup remove ram0_big_optio > + _cleanup_brd > +} > + > +# Sets up a local host nvme over tcp > +setup_nvme_over_tcp() { > + _setup_nvmet > + > + local port > + port="$(_create_nvmet_port "${nvme_trtype}")" > + > + _create_nvmet_subsystem "blktests-subsystem-0" "/dev/mapper/ram0_big_optio" > + _add_nvmet_subsys_to_port "${port}" "blktests-subsystem-0" > + > + _create_nvmet_host "blktests-subsystem-0" "${def_hostnqn}" > + > + _nvme_connect_subsys --subsysnqn "blktests-subsystem-0" > + nvmedev=$(_find_nvme_dev "blktests-subsystem-0") > +} > + > +cleanup_nvme_over_tcp() { > + _nvmet_target_cleanup --subsysnqn "blktests-subsystem-0" > +} > + > +test() { > + echo "Running ${TEST_NAME}" > + > + setup_underlying_device > + setup_nvme_over_tcp > + > + # Hangs here without the fix > + mdadm -q --create /dev/md/blktests_md --level=1 --bitmap=internal \ In the past, short options caused some trouble. I suggest to use the long option "--quiet" in place of the short option "-q". > + --bitmap-chunk=1024K --assume-clean --run --raid-devices=2 \ > + /dev/"${nvmedev}"n1 missing > + > + mdadm -q --stop /dev/md/blktests_md Ditto. > + cleanup_nvme_over_tcp > + cleanup_underlying_device > + > + echo "Test complete" > +} > diff --git a/tests/md/001.out b/tests/md/001.out > new file mode 100644 > index 0000000..edd2ced > --- /dev/null > +++ b/tests/md/001.out > @@ -0,0 +1,2 @@ > +Running md/001 > +Test complete > diff --git a/tests/md/rc b/tests/md/rc > new file mode 100644 > index 0000000..d492579 > --- /dev/null > +++ b/tests/md/rc > @@ -0,0 +1,12 @@ > +#!/bin/bash > +# SPDX-License-Identifier: GPL-3.0+ > +# Copyright (C) 2024 Ofir Gal > +# > +# Tests for md raid > + > +. common/rc > + > +group_requires() { > + _have_root > + _have_program mdadm > +} > -- > 2.45.1 >