On Wed, Sep 04, 2024 at 09:43:59PM +0200, Martin Wilck wrote:
> On Wed, 2024-09-04 at 21:36 +0200, Martin Wilck wrote:
> > On Wed, 2024-09-04 at 14:29 -0400, Benjamin Marzinski wrote:
> > > On Wed, Sep 04, 2024 at 06:12:37PM +0200, Martin Wilck wrote:
> > > > On Wed, 2024-08-28 at 18:17 -0400, Benjamin Marzinski wrote:
> > > > > Make the directio tests work with libcheck_pending() being
> > > > > separate from libcheck_check
> > > > >
> > > > > Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx>
> > > >
> > > > There's still something wrong with this test. I'm seeing lots of
> > > > CI errors with your complete series applied.
> > > >
> > > > https://github.com/openSUSE/multipath-tools/actions?query=branch%3Atip
> > > > https://github.com/openSUSE/multipath-tools/actions/runs/10704501258/job/29677643779
> > >
> > > It looks like your "tip" branch is missing:
> > > [PATCH 04/15] libmultipath: remove pending wait code from
> > > libcheck_check calls
> >
> > Yeah. That patch ended up in a different mail folder, and I didn't
> > notice. Weird. CI looks much better now.
>
> But some issues remain, e.g.
>
> https://github.com/openSUSE/multipath-tools/actions/runs/10708349169/job/29690448105

I'm pretty sure that due to valgrind and virtual machine induced delays,
we end up waiting more than 1ms in test_check_state_async() between
starting the checker at

    do_check_state(&c[256], 0, PATH_PENDING);

and calling libcheck_pending at

    do_libcheck_pending(&c[256], PATH_UP);

This means that we will only call get_events() once, and we won't get
the IO for c[256], which the test returns on the second call to
get_events(). This would cause the error from the github CI runs (I
haven't been able to reproduce this myself locally, but I haven't tried
on an Ubuntu VM):

[ RUN ] test_check_state_async
[ ERROR ] --- 0x6 != 0x3
[ LINE ] --- directio.c:237: error: Failure!
[ FAILED ] test_check_state_async

Since the time it takes the test program to run is out of our hands and
the checker wait time isn't configurable, I'm not sure that we can
guarantee that this test will always run correctly while testing this
code path without being a little hacky and manually bumping up
ct->endtime so that we're sure it hasn't already passed when we call
libcheck_pending().

Obviously if we took your route and did the waiting outside of
libcheck_pending(), then this code path wouldn't exist and the problem
would go away. I'll think on this a bit.

-Ben

> > Martin
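
For illustration only, here is a toy model of the race described above.
It is a sketch under stated assumptions, not the libmultipath code: every
name in it (toy_checker, toy_start, toy_pending, TOY_WAIT_US) is
hypothetical, the clock is passed in by hand so the behavior is
deterministic, and "polls" stand in for calls to get_events(). It assumes
the pending path polls at least once and keeps polling only while the
deadline set at start time has not yet passed.

```c
#include <assert.h>

/* Hypothetical stand-ins; these are NOT the real libmultipath types. */
enum toy_state { TOY_PENDING, TOY_UP };

struct toy_checker {
	long endtime_us;   /* deadline fixed when the check is started */
	int polls_needed;  /* which poll finally delivers this checker's IO */
	int polls_done;    /* polls performed so far */
};

#define TOY_WAIT_US 1000L /* models the 1ms wait discussed in the thread */

/* Start a check: the deadline is computed once, from the start time. */
static void toy_start(struct toy_checker *c, long now_us, int polls_needed)
{
	assert(polls_needed > 0);
	c->endtime_us = now_us + TOY_WAIT_US;
	c->polls_needed = polls_needed;
	c->polls_done = 0;
}

/*
 * Poll once unconditionally, then keep polling only while the deadline
 * has not passed. The clock is frozen during the call to keep the model
 * deterministic; a real wait would let time advance between polls.
 */
static enum toy_state toy_pending(struct toy_checker *c, long now_us)
{
	do {
		c->polls_done++;
		if (c->polls_done >= c->polls_needed)
			return TOY_UP;
	} while (now_us < c->endtime_us);

	return TOY_PENDING;
}
```

In this model, a checker started at time 0 whose IO arrives on the second
poll reports TOY_UP when toy_pending() is called at 500us, but stays at
TOY_PENDING when the call is delayed to 1500us, because only one poll
happens. Moving endtime_us past the call time before calling
toy_pending(), which is the "bump up the end time" hack in toy form, makes
the delayed case report TOY_UP again.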