Re: [PATCH v4 04/22] libmultipath: remove pending wait code from libcheck_check calls

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2024-10-08 at 21:15 -0400, Benjamin Marzinski wrote:
> When the tur and directio checkers start an asynchronous checker,
> they
> now immediately return with the path in PATH_PENDING, instead of
> waiting in checker_check(). Instead the wait now happens in
> checker_get_state().
> 
> Additionally, the directio checker now waits for 1 ms, like the tur
> checker does. Also like the tur checker it now only waits once. If it
> is still pending after the first call to checker_get_state(). It will
> not wait at all on future calls, and will just process the already
> completed IOs.
> 
> Reviewed-by: Martin Wilck <mwilck@xxxxxxxx>
> Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx>
> ---
>  libmultipath/checkers.c          |  7 +++-
>  libmultipath/checkers/directio.c | 57 +++++++++++++++++-------------
> --
>  libmultipath/checkers/tur.c      | 13 +++-----
>  3 files changed, 41 insertions(+), 36 deletions(-)
> 
> diff --git a/libmultipath/checkers/directio.c
> b/libmultipath/checkers/directio.c
> index 3f88b40d..904e3071 100644
> --- a/libmultipath/checkers/directio.c
> +++ b/libmultipath/checkers/directio.c
> @@ -60,10 +60,11 @@ const char *libcheck_msgtable[] = {
>  #define LOG(prio, fmt, args...) condlog(prio, "directio: " fmt,
> ##args)
>  
>  struct directio_context {
> -	int		running;
> +	unsigned int	running;
>  	int		reset_flags;
>  	struct aio_group *aio_grp;
>  	struct async_req *req;
> +	struct timespec endtime;
>  };
>  
>  static struct aio_group *
> @@ -314,19 +315,16 @@ check_pending(struct directio_context *ct,
> struct timespec endtime)
>  static int
>  check_state(int fd, struct directio_context *ct, int sync, int
> timeout_secs)
>  {
> -	struct timespec	timeout = { .tv_nsec = 1000 };
>  	struct stat	sb;
>  	int		rc;
> +	struct io_event event;
>  	struct timespec endtime;
>  
>  	if (fstat(fd, &sb) == 0) {
>  		LOG(4, "called for %x", (unsigned) sb.st_rdev);
>  	}
> -	if (sync > 0) {
> +	if (sync > 0)
>  		LOG(4, "called in synchronous mode");
> -		timeout.tv_sec  = timeout_secs;
> -		timeout.tv_nsec = 0;
> -	}
>  
>  	if (ct->running) {
>  		if (ct->req->state != PATH_PENDING) {
> @@ -345,31 +343,26 @@ check_state(int fd, struct directio_context
> *ct, int sync, int timeout_secs)
>  			LOG(3, "io_submit error %i", -rc);
>  			return PATH_UNCHECKED;
>  		}
> +		get_monotonic_time(&ct->endtime);
> +		ct->endtime.tv_nsec += 1000 * 1000;
> +		normalize_timespec(&ct->endtime);
>  	}
>  	ct->running++;

Sorry, I have another concern here, despite my reviewed-by. This very
old statement, combined with "if (ct->running > c->timeout)" below,
somehow assumes that "running" counts seconds, which means that this
code must be called once a second. I guess this has been approximately
true in the past, when this code was originally written. It has never
been clean, we should have compared time stamps instead. 

But now, it seems to be plain wrong, "running" will never have a higher
value than 1 as it is only called once from libcheck_check(), and the
(ct->running - c->timeout) statement is broken.

Am I misreading this code?

Martin






[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux