FYI, I still think this whole limit is a bad idea. Instead we need to fix the algorithms so they don't waste time iterating over the bios again and again in non-optimal ways. I still haven't seen an answer on whether your original latency problems are reproducible with the iomap direct I/O implementation, and we haven't even started the work on making some of the common iterators less stupid.