On Wed, Oct 09, 2024 at 05:00:55PM +0200, Martin Wilck wrote: > On Tue, 2024-10-08 at 21:15 -0400, Benjamin Marzinski wrote: > > When the tur and directio checkers start an asynchronous checker, > > they > > now immediately return with the path in PATH_PENDING, instead of > > waiting in checker_check(). Instead the wait now happens in > > checker_get_state(). > > > > Additionally, the directio checker now waits for 1 ms, like the tur > > checker does. Also like the tur checker it now only waits once. If it > > is still pending after the first call to checker_get_state(). It will > > not wait at all on future calls, and will just process the already > > completed IOs. > > > > Reviewed-by: Martin Wilck <mwilck@xxxxxxxx> > > Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx> > > --- > > libmultipath/checkers.c | 7 +++- > > libmultipath/checkers/directio.c | 57 +++++++++++++++++------------- > > -- > > libmultipath/checkers/tur.c | 13 +++----- > > 3 files changed, 41 insertions(+), 36 deletions(-) > > > > diff --git a/libmultipath/checkers/directio.c > > b/libmultipath/checkers/directio.c > > index 3f88b40d..904e3071 100644 > > --- a/libmultipath/checkers/directio.c > > +++ b/libmultipath/checkers/directio.c > > @@ -60,10 +60,11 @@ const char *libcheck_msgtable[] = { > > #define LOG(prio, fmt, args...) condlog(prio, "directio: " fmt, > > ##args) > > > > struct directio_context { > > - int running; > > + unsigned int running; > > int reset_flags; > > struct aio_group *aio_grp; > > struct async_req *req; > > + struct timespec endtime; > > }; > > > > static struct aio_group * > > @@ -314,19 +315,16 @@ check_pending(struct directio_context *ct, > > struct timespec endtime) > > static int > > check_state(int fd, struct directio_context *ct, int sync, int > > timeout_secs) > > { > > - struct timespec timeout = { .tv_nsec = 1000 }; > > struct stat sb; > > int rc; > > + struct io_event event; > > struct timespec endtime; > > > > if (fstat(fd, &sb) == 0) { > > LOG(4, "called for %x", (unsigned) sb.st_rdev); > > } > > - if (sync > 0) { > > + if (sync > 0) > > LOG(4, "called in synchronous mode"); > > - timeout.tv_sec = timeout_secs; > > - timeout.tv_nsec = 0; > > - } > > > > if (ct->running) { > > if (ct->req->state != PATH_PENDING) { > > @@ -345,31 +343,26 @@ check_state(int fd, struct directio_context > > *ct, int sync, int timeout_secs) > > LOG(3, "io_submit error %i", -rc); > > return PATH_UNCHECKED; > > } > > + get_monotonic_time(&ct->endtime); > > + ct->endtime.tv_nsec += 1000 * 1000; > > + normalize_timespec(&ct->endtime); > > } > > ct->running++; > > Sorry, I have another concern here, despite my reviewed-by. This very > old statement, combined with "if (ct->running > c->timeout)" below, > somehow assumes that "running" counts seconds, which means that this > code must be called once a second. I guess this has been approximately > true in the past, when this code was originally written. It has never > been clean, we should have compared time stamps instead. Yeah. That makes sense. > > But now, it seems to be plain wrong, "running" will never have a higher > value than 1 as it is only called once from libcheck_check(), and the > (ct->running - c->timeout) statement is broken. > > Am I misreading this code? Either you are or I am. libcheck_check() is called by start_checker() each time a path comes up for checking. libcheck_check() calls check_state() with the number of seconds that the checker should wait till it times out. If ct->running is non-zero, but the path is still pending, check_state() increments ct->running and returns PATH_PENDING. libcheck_pending() is called by get_state() every time a path gets checked and its state is PATH_PENDING. If ct->running is greater than c->timeout then the checker has timed out (it assumes that ct->running is counting seconds) and gets cancelled. Also, I only see (ct->running > c->timeout), not (ct->running - c->timeout) are you talking about something else in the code that I'm missing? At any rate, I can change this to use actual timestamps instead of our ct->running assumptions. -Ben > > Martin