PR for this is up now.

On Sun, Aug 28, 2022 at 4:01 PM Nick Neumann <nick@xxxxxxxxxxxxxxxx> wrote:
>
> I've filed the issue on github, but just thought I'd mention here too.
> In real-world use it appears to be intermittent. I'm not yet sure how
> intermittent, but I could see it being used in production and not
> caught right away. I got lucky and stumbled on it when looking at
> graphs of runs and noticed 15 seconds of no activity.
>
> https://github.com/axboe/fio/issues/1457
>
> With the null ioengine, I can make it reproduce very reliably, which
> is encouraging as I move to debug.
>
> I had just moved to using log compression as it is really powerful,
> and the only way to store per I/O logs for a long run without pushing
> up against the amount of physical memory in a system.
>
> (Without compression, a GB of sequential writes at 128K block size is
> on the order of 245KB of memory per log, so a TB is 245MB per log. Now
> run a job to fill a 20TB drive and you're at 4.9GB for one log file.
> If you record all 3 latency numbers too, you're talking close to
> 20GB.)
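
For anyone who wants to sanity-check the memory figures quoted above, here is a rough sketch of the arithmetic in Python. The 245KB-per-GB number is taken straight from the message; the per-entry size is only back-derived from it and is not a statement about fio's internal log entry layout. The function names are hypothetical, and treating "all 3 latency numbers too" as four logs in total is my assumption.

```python
# Back-of-the-envelope check of the uncompressed log memory figures.
# Only LOG_KB_PER_GB and the 128K block size come from the message;
# everything else is derived or assumed for illustration.

BLOCK_SIZE = 128 * 1024      # 128K block size
GIB = 1024 ** 3              # one GB of sequential writes (binary units)
LOG_KB_PER_GB = 245          # quoted: ~245KB of log memory per GB written


def ios_per_gb(block_size=BLOCK_SIZE):
    """Number of I/Os (one log entry each) needed to write one GB."""
    return GIB // block_size


def approx_bytes_per_entry():
    """Per-entry size implied by the quoted figure (an estimate only)."""
    return LOG_KB_PER_GB * 1000 / ios_per_gb()


def log_memory_gb(drive_tb, num_logs=1):
    """Memory in GB for num_logs logs while filling a drive of drive_tb TB.

    Follows the message's scaling: 245KB per GB -> 245MB per TB.
    """
    mb_per_tb = LOG_KB_PER_GB
    return drive_tb * mb_per_tb * num_logs / 1000


if __name__ == "__main__":
    print("I/Os per GB:", ios_per_gb())
    print("approx bytes per log entry:", round(approx_bytes_per_entry(), 1))
    print("one log, 20TB drive (GB):", log_memory_gb(20))
    # Assumption: one log plus three latency logs = four logs in total.
    print("four logs, 20TB drive (GB):", log_memory_gb(20, num_logs=4))
```

At 128K that works out to 8192 entries per GB, so the quoted figure implies roughly 30 bytes per entry, and four logs on a 20TB fill come to 19.6GB, which matches the "close to 20GB" estimate.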