On 3/27/23 16:43, Christoph Hellwig wrote:
> On Mon, Mar 27, 2023 at 02:06:09PM -0700, Bart Van Assche wrote:
>> Hence, the number of extents
>> for large files increases and performance when reading large files reduces.
>> To me comparing the performance of these two approaches sounds like a good
>> topic for a research paper. I'm not sure that REQ_OP_ZONE_APPEND is better
>> for all zoned storage workloads than REQ_OP_WRITE.
>
> For REQ_OP_WRITE you absolutely must avoid reordering, so you need to
> globally serialize. If you can come up with a workload where your write
> based approach is fast, please show it!

When using REQ_OP_ZONE_APPEND, a global lock or atomics are necessary in
the space allocator to prevent attempts to write more data into a zone
than fits into that zone.
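
A minimal userspace sketch of the atomics variant, assuming a hypothetical
per-zone counter (struct zone_space and zone_reserve() are made-up names for
illustration, not kernel code):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-zone accounting; not a kernel structure. */
struct zone_space {
	uint64_t capacity;          /* sectors that fit into the zone */
	_Atomic uint64_t reserved;  /* sectors handed out so far */
};

/*
 * Reserve nr_sectors in the zone. Returns false if the zone cannot
 * hold them. No LBA is chosen here: with zone append the device
 * reports where the data landed when the write completes.
 */
static bool zone_reserve(struct zone_space *z, uint64_t nr_sectors)
{
	uint64_t old = atomic_load(&z->reserved);

	do {
		/* reserved never exceeds capacity, so this cannot underflow */
		if (nr_sectors > z->capacity - old)
			return false;
		/* on failure the CAS reloads 'old' and the check is redone */
	} while (!atomic_compare_exchange_weak(&z->reserved, &old,
					       old + nr_sectors));
	return true;
}
```

The compare-and-exchange loop only guarantees that the sum of all
reservations stays within the zone capacity; it says nothing about the order
in which the appends complete.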

When using REQ_OP_WRITE, only the code that assigns LBAs has to be
serialized. The writes themselves do not have to be serialized as long as
they are not reordered on their way to the device.
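
As an illustration only, the LBA assignment step could look like the
userspace sketch below. struct zone_wp and zone_assign_lba() are hypothetical
names, and the atomic_flag spinlock stands in for whatever lock a filesystem
would really use:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-zone write pointer tracking; not a kernel structure. */
struct zone_wp {
	uint64_t start;    /* first LBA of the zone */
	uint64_t capacity; /* sectors that fit into the zone */
	uint64_t offset;   /* next free sector, relative to start */
	atomic_flag lock;  /* protects offset */
};

/*
 * Assign the start LBA for a write of nr_sectors. Only this step is
 * serialized. The write itself is submitted after the lock has been
 * dropped and still has to reach the device in LBA order.
 */
static bool zone_assign_lba(struct zone_wp *z, uint64_t nr_sectors,
			    uint64_t *lba)
{
	bool ok = false;

	while (atomic_flag_test_and_set_explicit(&z->lock,
						 memory_order_acquire))
		; /* spin until the lock is free */
	if (nr_sectors <= z->capacity - z->offset) {
		*lba = z->start + z->offset;
		z->offset += nr_sectors;
		ok = true;
	}
	atomic_flag_clear_explicit(&z->lock, memory_order_release);
	return ok;
}
```

The critical section covers a comparison and two additions, so it is about as
short as the compare-and-exchange loop needed for zone append.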

My point is that it is nontrivial to compare filesystem designs based on
REQ_OP_WRITE versus REQ_OP_ZONE_APPEND and hence that zoned writes
implemented with REQ_OP_WRITE should remain supported.

Bart.