On 09/13/2016 04:43 PM, Ben England wrote:
I am now running with Jens' patch "ignore SEND_ETA, if we can't fin a reply command". Thanks! I had been trying to work around it by increasing the timeout. I pushed one last commit to PR 241 to address the remaining comment. If it looks OK to you, I would appreciate it being pulled in, because it is important for distributed fio --client testing in order to get bandwidth/IOPS logs and latency histogram logs. I have been testing it for about a week now with 900 workload generator threads across 7 hosts and it seems to work, especially now that the assertion isn't firing because of the above patch.
Glad that works!
An unrelated minor problem: there are still a couple of threads that have trouble returning a fio histogram to the test driver. I will look into that, but at least fio --client is getting almost all the logs now with PR 241, enough that meaningful stats can be generated. The error messages are:

  fio: failed decompressing log
  fio: failed converting IO log

This doesn't happen for smaller numbers of workload generator threads.
That usually points to a bug in the compression or decompression handling, so I'd probably look at that. We've had that in a few other areas, you might want to look at the changelog and see the fixes for those.

--
Jens Axboe