[no subject]

Nick Neumann <nick@xxxxxxxxxxxxxxxx> · Sun, 28 Aug 2022 16:01:30 -0500

I've filed the issue on GitHub, but just thought I'd mention it here too.
In real-world use it appears to be intermittent. I'm not yet sure how
intermittent, but I could see it being used in production and not
caught right away. I got lucky and stumbled on it when looking at
graphs of runs and noticed 15 seconds of no activity.

https://github.com/axboe/fio/issues/1457

With the null ioengine, I can make it reproduce very reliably, which
is encouraging as I move to debug.

I had just moved to using log compression as it is really powerful,
and the only way to store per I/O logs for a long run without pushing
up against the amount of physical memory in a system.

(Without compression, a GB of sequential writes at 128K block size is
on the order of 245KB of memory per log, so a TB is 245MB per log. Now
run a job to fill a 20TB drive and you're at 4.9GB for one log file.
If you record all 3 latency numbers too, you're talking close to
20GB.)
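
As a rough check of the arithmetic above, here is a minimal sketch that
scales the 245KB-per-GB figure linearly. It assumes decimal multipliers
(1TB = 1000GB) and reads "all 3 latency numbers too" as four logs in
total; the function and constant names are made up for illustration and
are not part of fio.

```python
# Back-of-the-envelope estimate of uncompressed per-I/O log memory.
# The only input taken from the email is ~245KB of log per GB written
# at 128K block size; everything else is an illustrative assumption.

BLOCK_SIZE = 128 * 1024           # 128K block size, in bytes
LOG_BYTES_PER_GB = 245 * 1000     # ~245KB of log memory per GB written


def entries_per_gb(block_size=BLOCK_SIZE):
    """Number of I/Os (log entries) needed to write 2**30 bytes."""
    return (1024 ** 3) // block_size


def bytes_per_entry():
    """Per-entry size implied by the 245KB figure (derived, not measured)."""
    return LOG_BYTES_PER_GB / entries_per_gb()


def log_bytes(gb_written, logs=1):
    """Estimated log memory in bytes, scaling linearly with data written."""
    return gb_written * LOG_BYTES_PER_GB * logs


if __name__ == "__main__":
    print("entries per GB:  ", entries_per_gb())
    print("bytes per entry:  %.1f" % bytes_per_entry())
    print("1TB, one log:     %.0f MB" % (log_bytes(1000) / 1e6))
    print("20TB, one log:    %.1f GB" % (log_bytes(20000) / 1e9))
    # Assumption: one log plus the three latency logs = four logs.
    print("20TB, four logs:  %.1f GB" % (log_bytes(20000, logs=4) / 1e9))
```

The implied size of roughly 30 bytes per entry is only what the 245KB
figure works out to, not a number read from the fio source.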