Re: [PATCH tests 2/5] tests: add a new test for rdev lifetime

On Wed, 24 May 2023 17:05:43 +0800
Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

> Hi,
> 
> On 2023/05/24 16:33, Mariusz Tkaczyk wrote:
> > On Tue, 23 May 2023 21:38:57 +0800
> > Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> >   
> >> From: Yu Kuai <yukuai3@xxxxxxxxxx>
> >>
> >> This test adds and removes an underlying disk of a raid array
> >> concurrently, to verify that the following problem is fixed:  
> > 
> > As in previous patch, feel free to move it into separate directory.
> > 
> > This test is limited to the particular problem you resolved, because
> > you are verifying a specific error message in dmesg. It has no additional
> > value, because the probability that this issue will ever occur again in
> > the same shape is minimal.
> > 
> > IMO you should check how "remove" and "add" are handled: whether errors
> > are returned, that there is no trace in dmesg, and that processes are not
> > blocked in the kernel.  
> 
> It's a little hard to do that. The problem is that after removing a disk
> from the array, adding it back might fail. But if I follow this order,
> it'll be hard to trigger the race, since that depends on how quickly the
> kernel finishes queued work. So I just remove and add the disk concurrently,
> and returned errors are not a concern as long as the kernel doesn't WARN.

Ok, makes sense. All I really care about is not grepping dmesg for a particular
error message. We should make the verification flexible, because in the future
an error may come from a different place in the same test. That is the main
objection here. Maybe we can do a check like the one in the generic mdadm
do_test():
dmesg | grep -iq "error\|call trace\|segfault"

I know that it is done anyway because do_test() is involved, but I would prefer
to have this verification in the test itself, so that the reported error message
is as meaningful as possible.
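
A minimal sketch of how such a check could sit in the test itself. The helper
name dmesg_is_clean is hypothetical (it is not an existing mdadm function),
and the usage shown in the comments assumes the test is allowed to clear the
kernel ring buffer:

```shell
#!/bin/bash
# Hypothetical helper, not part of the mdadm test suite.
# Reads kernel log text on stdin and returns 0 only if none of the
# generic failure markers appear, so the check is not tied to one
# specific error message.
dmesg_is_clean() {
    ! grep -iq "error\|call trace\|segfault"
}

# Intended use inside the test (usually needs root):
#   dmesg -c > /dev/null        # drop old messages before the race starts
#   ... run the concurrent add/remove loop ...
#   dmesg | dmesg_is_clean || { echo "kernel reported a problem"; exit 1; }
```

Reading from stdin keeps the helper independent of dmesg itself, so it can
also be run against a saved log.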

But, as we discussed in the other patch, mdadm is probably not the perfect place
for tests like that, so please be aware of it if you decide to add this test
to the kernel.

> > You can check for this error message as an additional step at the end of
> > the test, but not as a mandatory pass criterion.
> > 
> > In its current form it only tells us that a particular kernel doesn't have
> > your fix, that is all. Because it is a race, it probably does not impact
> > real-life scenarios, so that gives weak motivation to backport the fix
> > (only security reasons matter).
> > 
> > I don't see that this particular scenario requires a test. You need to make
> > it more valuable for the future.  
> 
> This is just a regression test for a kernel problem that was solved recently
> (as are all the tests in this patchset), and I wrote this test from the
> kernel perspective, not the user's. I think this is the main difference:
> I'm not very familiar with how mdadm works, I just know how to use it.
> (I still wonder why these kernel regression tests have not landed in
> blktests.)
> 
> There are more regression tests that I'm planning to add. Is this the
> wrong place to add such tests?

I think we cleared it up in patch#4.

Thanks for the quick feedback,
Mariusz


