Re: fio hangs with --status-interval

On 2014-07-10 00:56, Michael Mattsson wrote:
Hey,
I've got 8 identical CentOS 6.5 clients on which fio randomly hangs
when using --status-interval. I've tried fio 2.1.4 and fio 2.1.10;
they both behave the same. I've also tried piping the output to tee
instead of redirecting to a file. I also tried --output with a specified
output file, still the same problem. My fio command runs through its tests
flawlessly without --status-interval and exits cleanly every time.
Anywhere from 0 to 5 clients get affected on a given run.
Running strace on the process that seems hung yields the following
output:

$ strace -p 31055
Process 31055 attached - interrupt to quit
futex(0x7f346ede802c, FUTEX_WAIT, 1, NULL

Strange. It must be stuck on the stat mutex, but I don't immediately see why that would happen. Does the attached patch make any difference for you? Specifically, does it get rid of the hang while still producing output at the desired intervals?

--
Jens Axboe

diff --git a/stat.c b/stat.c
index 979c8100d378..93316a239f7b 100644
--- a/stat.c
+++ b/stat.c
@@ -1466,11 +1466,12 @@ static void *__show_running_run_stats(void fio_unused *arg)
  * in the sig handler, but we should be disturbing the system less by just
  * creating a thread to do it.
  */
-void show_running_run_stats(void)
+int show_running_run_stats(void)
 {
 	pthread_t thread;
 
-	fio_mutex_down(stat_mutex);
+	if (fio_mutex_down_trylock(stat_mutex))
+		return 1;
 
 	if (!pthread_create(&thread, NULL, __show_running_run_stats, NULL)) {
 		int err;
@@ -1479,10 +1480,11 @@ void show_running_run_stats(void)
 		if (err)
 			log_err("fio: DU thread detach failed: %s\n", strerror(err));
 
-		return;
+		return 0;
 	}
 
 	fio_mutex_up(stat_mutex);
+	return 1;
 }
 
 static int status_interval_init;
@@ -1531,8 +1533,8 @@ void check_for_running_stats(void)
 			fio_gettime(&status_time, NULL);
 			status_interval_init = 1;
 		} else if (mtime_since_now(&status_time) >= status_interval) {
-			show_running_run_stats();
-			fio_gettime(&status_time, NULL);
+			if (!show_running_run_stats())
+				fio_gettime(&status_time, NULL);
 			return;
 		}
 	}
diff --git a/stat.h b/stat.h
index 2e46175053e8..82b8e973e4be 100644
--- a/stat.h
+++ b/stat.h
@@ -218,7 +218,7 @@ extern void show_group_stats(struct group_run_stats *rs);
 extern int calc_thread_status(struct jobs_eta *je, int force);
 extern void display_thread_status(struct jobs_eta *je);
 extern void show_run_stats(void);
-extern void show_running_run_stats(void);
+extern int show_running_run_stats(void);
 extern void check_for_running_stats(void);
 extern void sum_thread_stats(struct thread_stat *dst, struct thread_stat *src, int nr);
 extern void sum_group_stats(struct group_run_stats *dst, struct group_run_stats *src);
