RE: fio server/client disconnect bug

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Here's the failing case where host1 does not print:

back=1, groupid=0, nr_ts=1
client: got cmd op PROBE from host2 (pdu=112)
client host2 removed 2
client: got cmd op PROBE from host1 (pdu=112)
client host1 removed 2
client: got cmd op START from host2 (pdu=8)
client host2 removed 2
client: got cmd op START from host1 (pdu=8)
client host1 removed 2
client: got cmd op ADD_JOB from host2 (pdu=23204)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=70)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=90)
client host2 removed 2
client: got cmd op ADD_JOB from host1 (pdu=23204)
client host1 removed 2
client: got cmd op TEXT from host2 (pdu=51)
client host2 removed 2
client: got cmd op TEXT from host1 (pdu=70)
client host1 removed 2
client: got cmd op START from host2 (pdu=8)
client host2 removed 2
client: got cmd op TEXT from host1 (pdu=90)
client host1 removed 2
client: got cmd op TEXT from host1 (pdu=51)
client host1 removed 2
client: got cmd op START from host1 (pdu=8)
client host1 removed 2
client: got cmd op TEXT from host2 (pdu=33)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=33)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=25)
client host2 removed 2
client: got cmd op TEXT from host1 (pdu=33)
client host1 removed 2
client: got cmd op TEXT from host1 (pdu=33)
client host1 removed 2
client: got cmd op TEXT from host1 (pdu=25)
client host1 removed 2
client: got cmd op SERVER_START from host2 (pdu=0)
client host2 removed 2
client: got cmd op SERVER_START from host1 (pdu=0)
client host1 removed 2
client: got cmd op ETA from host2 (pdu=165)
client host2 removed 2
client: got cmd op ETA from host1 (pdu=165)
client host1 removed 2
client: got cmd op ETA from host2 (pdu=165)
client host2 removed 2
client: got cmd op ETA from host1 (pdu=165)
client host1 removed 2
client: got cmd op ETA from host2 (pdu=165)
client host2 removed 2
client: got cmd op ETA from host1 (pdu=165)
client host1 removed 25.0%][r=2667M,w=0k][r=41.7k,w=0 IOPS][eta 00m:02s]
client: got cmd op ETA from host2 (pdu=165)
client host2 removed 2
client: got cmd op ETA from host1 (pdu=165)
client host1 removed 28.9%][r=2661M,w=0k][r=41.6k,w=0 IOPS][eta 00m:01s]
client: got cmd op ETA from host2 (pdu=165)
client host2 removed 2
client: got cmd op ETA from host1 (pdu=165)
client host1 removed 200.0%][r=2669M,w=0k][r=41.7k,w=0 IOPS][eta 00m:00s]
client: got cmd op IOLOG from host2 (pdu=338)
client host2 removed 2
client: got cmd op IOLOG from host2 (pdu=327)
client host2 removed 2
client: got cmd op IOLOG from host2 (pdu=337)
client host2 removed 2
client: got cmd op IOLOG from host1 (pdu=338)
client host1 removed 2
client: got cmd op IOLOG from host1 (pdu=325)
client host1 removed 2
client: got cmd op IOLOG from host1 (pdu=337)
client host1 removed 2
client: got cmd op IOLOG from host1 (pdu=323)
client host1 removed 2
client: got cmd op IOLOG from host1 (pdu=320)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op STOP from host1 (pdu=8)
client host1 removed 2
client: got cmd op QUIT from host1 (pdu=0)
client host1 removed 2
client host1 removed 1
client: got cmd op IOLOG from host2 (pdu=331)
client host2 removed 2
client: got cmd op IOLOG from host2 (pdu=327)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op TS from host2 (pdu=94888)
client host2 removed 2
client: got cmd op GS from host2 (pdu=164)
client host2 removed 2
client: got cmd op DISK_UTIL from host2 (pdu=256)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=25)
client host2 removed 2
client: got cmd op STOP from host2 (pdu=8)
client host2 removed 2
client: got cmd op QUIT from host2 (pdu=0)
client host2 removed 2
client host2 removed 1

And here's a good case:

client: got cmd op PROBE from host2 (pdu=112)
client host2 removed 2
client: got cmd op PROBE from host1 (pdu=112)
client host1 removed 2
client: got cmd op START from host2 (pdu=8)
client host2 removed 2
client: got cmd op START from host1 (pdu=8)
client host1 removed 2
client: got cmd op ADD_JOB from host2 (pdu=23204)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=70)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=90)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=51)
client host2 removed 2
client: got cmd op START from host2 (pdu=8)
client host2 removed 2
client: got cmd op ADD_JOB from host1 (pdu=23204)
client host1 removed 2
client: got cmd op TEXT from host1 (pdu=70)
client host1 removed 2
client: got cmd op TEXT from host1 (pdu=90)
client host1 removed 2
client: got cmd op TEXT from host1 (pdu=51)
client host1 removed 2
client: got cmd op START from host1 (pdu=8)
client host1 removed 2
client: got cmd op TEXT from host2 (pdu=33)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=33)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=25)
client host2 removed 2
client: got cmd op SERVER_START from host2 (pdu=0)
client host2 removed 2
client: got cmd op TEXT from host1 (pdu=33)
client host1 removed 2
client: got cmd op TEXT from host1 (pdu=33)
client host1 removed 2
client: got cmd op TEXT from host1 (pdu=25)
client host1 removed 2
client: got cmd op SERVER_START from host1 (pdu=0)
client host1 removed 2
client: got cmd op ETA from host2 (pdu=165)
client host2 removed 2
client: got cmd op ETA from host1 (pdu=165)
client host1 removed 2
client: got cmd op ETA from host2 (pdu=165)
client host2 removed 2
client: got cmd op ETA from host1 (pdu=165)
client host1 removed 2
client: got cmd op ETA from host2 (pdu=165)
client host2 removed 2
client: got cmd op ETA from host1 (pdu=165)
client host1 removed 25.0%][r=2667M,w=0k][r=41.7k,w=0 IOPS][eta 00m:02s]
client: got cmd op ETA from host2 (pdu=165)
client host2 removed 2
client: got cmd op ETA from host1 (pdu=165)
client host1 removed 28.9%][r=2669M,w=0k][r=41.7k,w=0 IOPS][eta 00m:01s]
client: got cmd op ETA from host2 (pdu=165)
client host2 removed 2
client: got cmd op ETA from host1 (pdu=165)
client host1 removed 200.0%][r=2675M,w=0k][r=41.8k,w=0 IOPS][eta 00m:00s]
client: got cmd op IOLOG from host2 (pdu=336)
client host2 removed 2
client: got cmd op IOLOG from host2 (pdu=327)
client host2 removed 2
client: got cmd op IOLOG from host2 (pdu=337)
client host2 removed 2
client: got cmd op IOLOG from host1 (pdu=338)
client host1 removed 2
client: got cmd op IOLOG from host1 (pdu=327)
client host1 removed 2
client: got cmd op IOLOG from host1 (pdu=338)
client host1 removed 2
client: got cmd op IOLOG from host2 (pdu=330)
client host2 removed 2
client: got cmd op IOLOG from host2 (pdu=327)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op JOB_OPT from host2 (pdu=200)
client host2 removed 2
client: got cmd op TS from host2 (pdu=94888)
client host2 removed 2
client: got cmd op GS from host2 (pdu=164)
client host2 removed 2
client: got cmd op DISK_UTIL from host2 (pdu=256)
client host2 removed 2
client: got cmd op TEXT from host2 (pdu=25)
client host2 removed 2
client: got cmd op STOP from host2 (pdu=8)
client host2 removed 2
client: got cmd op QUIT from host2 (pdu=0)
client host2 removed 2
client host2 removed 1
client: got cmd op IOLOG from host1 (pdu=330)
client host1 removed 2
client: got cmd op IOLOG from host1 (pdu=327)
back=1, groupid=0, nr_ts=1
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op JOB_OPT from host1 (pdu=200)
client host1 removed 2
client: got cmd op TS from host1 (pdu=94888)
client host1 removed 2
client: got cmd op GS from host1 (pdu=164)
client host1 removed 2
client: got cmd op DISK_UTIL from host1 (pdu=256)
client host1 removed 2
client: got cmd op TEXT from host1 (pdu=25)
client host1 removed 2
client: got cmd op STOP from host1 (pdu=8)
client host1 removed 2
client: got cmd op QUIT from host1 (pdu=0)
client host1 removed 2
client host1 removed 1

Regards,
Jeff


-----Original Message-----
From: Jens Axboe [mailto:axboe@xxxxxxxxx] 
Sent: Wednesday, March 21, 2018 10:12 AM
To: Jeff Furlong <jeff.furlong@xxxxxxx>; fio@xxxxxxxxxxxxxxx
Subject: Re: fio server/client disconnect bug

On 3/21/18 10:58 AM, Jeff Furlong wrote:
> Patch applied to latest git.  Failed after 26 iterations.  The host1 did not print output to the summary file.
> 
> client: got cmd op PROBE from host2 (pdu=112)
> client: got cmd op PROBE from host1 (pdu=112)
> client: got cmd op START from host2 (pdu=8)
> client: got cmd op START from host1 (pdu=8)
> client: got cmd op ADD_JOB from host2 (pdu=23204)
> client: got cmd op TEXT from host2 (pdu=70)
> client: got cmd op TEXT from host2 (pdu=90)
> client: got cmd op TEXT from host2 (pdu=51)
> client: got cmd op ADD_JOB from host1 (pdu=23204)
> client: got cmd op START from host2 (pdu=8)
> client: got cmd op TEXT from host1 (pdu=70)
> client: got cmd op TEXT from host1 (pdu=90)
> client: got cmd op TEXT from host1 (pdu=51)
> client: got cmd op START from host1 (pdu=8)
> client: got cmd op TEXT from host2 (pdu=33)
> client: got cmd op TEXT from host2 (pdu=33)
> client: got cmd op TEXT from host2 (pdu=25)
> client: got cmd op TEXT from host1 (pdu=33)
> client: got cmd op TEXT from host1 (pdu=33)
> client: got cmd op TEXT from host1 (pdu=25)
> client: got cmd op SERVER_START from host2 (pdu=0)
> client: got cmd op SERVER_START from host1 (pdu=0)
> client: got cmd op ETA from host2 (pdu=165)
> client: got cmd op ETA from host1 (pdu=165)
> client: got cmd op ETA from host2 (pdu=165)
> client: got cmd op ETA from host1 (pdu=165)
> client: got cmd op ETA from host2 (pdu=165)
> client: got cmd op ETA from host1 (pdu=165)
> client: got cmd op ETA from host2 (pdu=165)=41.6k,w=0 IOPS][eta 
> 00m:02s]
> client: got cmd op ETA from host1 (pdu=165)
> client: got cmd op ETA from host2 (pdu=165)=41.6k,w=0 IOPS][eta 
> 00m:01s]
> client: got cmd op ETA from host1 (pdu=165)
> client: got cmd op IOLOG from host2 (pdu=338)41.6k,w=0 IOPS][eta 
> 00m:00s]
> client: got cmd op IOLOG from host2 (pdu=327)
> client: got cmd op IOLOG from host2 (pdu=338)
> client: got cmd op IOLOG from host1 (pdu=337)
> client: got cmd op IOLOG from host1 (pdu=328)
> client: got cmd op IOLOG from host1 (pdu=336)
> client: got cmd op IOLOG from host2 (pdu=330)
> client: got cmd op IOLOG from host2 (pdu=327)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op JOB_OPT from host2 (pdu=200)
> client: got cmd op TS from host2 (pdu=94888)
> client: got cmd op GS from host2 (pdu=164)
> client: got cmd op IOLOG from host1 (pdu=324)
> client: got cmd op DISK_UTIL from host2 (pdu=256)
> client: got cmd op IOLOG from host1 (pdu=320)
> client: got cmd op TEXT from host2 (pdu=25)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op STOP from host2 (pdu=8)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op QUIT from host2 (pdu=0)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op JOB_OPT from host1 (pdu=200)
> client: got cmd op STOP from host1 (pdu=8)
> client: got cmd op QUIT from host1 (pdu=0)

So this is weird, after the JOB_OPT it should have send the gs/ts and so on summary. Which is what triggers the stat display. Can you remove the previous patch, and add this debug patch with some more printf goodness in it?


diff --git a/client.c b/client.c
index bff0adc0d972..0f740b5f3057 100644
--- a/client.c
+++ b/client.c
@@ -179,6 +179,8 @@ static struct fio_client *find_client_by_fd(int fd)
 
 void fio_put_client(struct fio_client *client)  {
+	printf("client %s removed %d\n", client->hostname, client->refs);
+
 	if (--client->refs)
 		return;
 
@@ -1664,7 +1666,7 @@ int fio_handle_client(struct fio_client *client)
 	if (!cmd)
 		return 0;
 
-	dprint(FD_NET, "client: got cmd op %s from %s (pdu=%u)\n",
+	printf("client: got cmd op %s from %s (pdu=%u)\n",
 		fio_server_op(cmd->opcode), client->hostname, cmd->pdu_len);
 
 	client->last_cmd = cmd->opcode;
diff --git a/stat.c b/stat.c
index 98ab63893900..c23a3c9ddabc 100644
--- a/stat.c
+++ b/stat.c
@@ -1882,6 +1882,8 @@ void __show_run_stats(void)
 		json_object_add_value_array(root, "jobs", array);
 	}
 
+	printf("back=%d, groupid=%d, nr_ts=%d\n", is_backend, groupid, nr_ts);
+
 	if (is_backend)
 		fio_server_send_job_options(&get_global_options()->opt_list, -1U);
 

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe fio" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Linux SCSI]     [Linux IDE]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux