On Wed, Mar 25, 2020 at 09:59:19AM +0000, Damien Le Moal wrote:
> On 2020/03/25 18:48, hch@xxxxxxxxxxxxx wrote:
> > On Wed, Mar 25, 2020 at 09:45:39AM +0000, Johannes Thumshirn wrote:
> >>
> >> Can you please elaborate on that? Why doesn't this hold true for a
> >> normal file system? If we split the DIO write into multiple BIOs with
> >> zone-append, there is nothing which guarantees the order of the written
> >> data (at least as far as I can see).
> >
> > Of course nothing guarantees the order.  But the whole point is that the
> > order does not matter.
> >
>
> The order does not matter at the DIO level since iomap dio end callback will
> allow the FS to add an extent mapping the written data using the drive indicated
> write location. But that callback is for the entire DIO, not per BIO fragment of
> the DIO. So if the BIO fragments of a large DIO get reordered, as Johannes said,
> we will get data corruption in the FS extent. No ?

I thought of recording the location in ->iomap_end (and in fact had
a prototype for that), but that is not going to work for AIO of
course.  So yes, we'll need some way to have per-extent completion
callbacks.
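
To make the reordering problem concrete, here is a minimal user-space
sketch in Python. It is a toy model, not kernel code: the names Zone,
zone_append, submit_dio, read_single_extent and read_per_extent are all
hypothetical and do not correspond to any iomap or block layer API. It
assumes the device places each zone-append fragment at the current
write pointer in dispatch order, and compares one mapping for the
whole DIO against one mapping per fragment.

```python
"""Toy model of a DIO split into zone-append fragments (not kernel code)."""


class Zone:
    """Hypothetical zone: the device, not the caller, picks the location."""

    def __init__(self, start):
        self.start = start
        self.blocks = []  # data in on-media order

    def zone_append(self, data):
        # The write lands at the current write pointer, which is returned.
        location = self.start + len(self.blocks)
        self.blocks.extend(data)
        return location

    def read(self, location, length):
        off = location - self.start
        return self.blocks[off:off + length]


def submit_dio(zone, fragments, dispatch_order):
    """Append fragments in dispatch_order.

    Returns (file_offset, location, length) tuples sorted by file offset,
    i.e. what a per-fragment completion callback could record.
    """
    file_offsets = []
    off = 0
    for frag in fragments:
        file_offsets.append(off)
        off += len(frag)

    extents = []
    for idx in dispatch_order:
        loc = zone.zone_append(fragments[idx])
        extents.append((file_offsets[idx], loc, len(fragments[idx])))
    return sorted(extents)


def read_single_extent(zone, extents):
    """Per-DIO completion: one extent covering the whole write."""
    start = min(loc for _, loc, _ in extents)
    total = sum(length for _, _, length in extents)
    return zone.read(start, total)


def read_per_extent(zone, extents):
    """Per-fragment completion: one extent per fragment, in file order."""
    out = []
    for _, loc, length in extents:
        out.extend(zone.read(loc, length))
    return out


if __name__ == "__main__":
    fragments = [list("AAAA"), list("BBBB"), list("CCCC")]
    zone = Zone(start=1000)
    # Fragment 1 reaches the device before fragment 0.
    extents = submit_dio(zone, fragments, dispatch_order=[1, 0, 2])
    print("single extent:", "".join(read_single_extent(zone, extents)))
    print("per extent:   ", "".join(read_per_extent(zone, extents)))
```

With the fragments dispatched as 1, 0, 2, the single-extent mapping
reads back BBBBAAAACCCC while the per-fragment mapping reads back
AAAABBBBCCCC. In this model the data on media is identical in both
cases; only the granularity at which locations are recorded differs.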