Hello,
As far as I know, there is no way to guarantee the ordering between block writes inside a single bio.
That is why the JBD2 module submits the journal commit block in a separate bio from the other log blocks.
Also, I think your idea can be optimized further.
If you write a checksum for some data, ordering between the checksum and the data is not needed.
After a crash, we just recalculate the checksum from the data and compare it with the one stored on disk.
Even if the checksum was written first, the recalculated checksum will differ from the stored one, because the new data was never written.
So I think that if you use a checksum, no ordering guarantee is needed.
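A rough sketch of that recovery check, in Python. CRC32 from zlib only stands in for whatever checksum the target would really use, and the function name is made up for illustration:

```python
import zlib


def chunk_is_consistent(data: bytes, stored_csum: int) -> bool:
    """Recompute the checksum of the on-disk data and compare it with the stored one."""
    return zlib.crc32(data) == stored_csum


# Crash case: the checksum of the new data reached the disk, the data itself did not.
old_data = b"old contents"
new_data = b"new contents"
stored_csum = zlib.crc32(new_data)

print(chunk_is_consistent(old_data, stored_csum))  # False: the torn update is detected
print(chunk_is_consistent(new_data, stored_csum))  # True: both writes completed
```

Note that this only detects the inconsistency; what to do with a chunk that fails the check is a separate question.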
This is the first time I have sent mail to the kernelnewbies mailing list.
If I did something wrong in this mail, I am very sorry.
Thank you.
Joontaek Oh.
On Tue, 28 Jan 2020 at 3:23 AM, Lukas Straub <lukasstraub2@xxxxxx> wrote:
On Mon, 27 Jan 2020 12:27:58 -0500
"Valdis Klētnieks" <valdis.kletnieks@xxxxxx> wrote:
> On Sun, 26 Jan 2020 13:07:38 +0100, Lukas Straub said:
>
> > I am planning to write a new device-mapper target and I'm wondering if there
> > is an ordering guarantee for the operations inside a single bio? For example if I
> > issue a write bio to sector 0 of length 4, is it guaranteed that sector 0 is
> > written first and sector 3 is written last?
>
> I'll bite. What are you doing where the order of writing out a single bio matters?
I plan to improve the performance of dm-integrity on HDDs by removing the requirement for a bitmap or journal (which causes head seeks even for sequential writes). I also want to avoid cache flushes and FUA. The problem with dm-integrity is that the data and checksum updates need to be atomic.
So I came up with the following idea:
The on-disk layout will look like this:
|csum_next-01|data-chunk-01|csum_prev-01|csum_next-02|data-chunk-02|csum_prev-02|...
Under normal conditions, csum_next-01 (a single sector) contains the checksums for data-chunk-01 and csum_prev-01 is a duplicate of csum_next-01.
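To make the layout concrete, here is a small Python sketch that computes where each piece of a chunk would sit on disk. The number of data sectors per chunk is not fixed anywhere above, so it is a parameter here; the one-sector size of csum_prev is assumed from it being a duplicate of csum_next, and the function name is hypothetical. Chunks are counted from 0 in the code, so chunk 0 corresponds to "01" in the diagram.

```python
CSUM_SECTORS = 1  # csum_next is a single sector; csum_prev is assumed to be the same size


def chunk_layout(chunk_index: int, data_sectors: int) -> dict:
    """Return the starting sector of csum_next, the data and csum_prev for one chunk."""
    stride = CSUM_SECTORS + data_sectors + CSUM_SECTORS  # sectors occupied per chunk
    base = chunk_index * stride
    return {
        "csum_next": base,
        "data_start": base + CSUM_SECTORS,
        "csum_prev": base + CSUM_SECTORS + data_sectors,
    }


# Example with a hypothetical chunk size of 8 data sectors.
print(chunk_layout(0, 8))  # {'csum_next': 0, 'data_start': 1, 'csum_prev': 9}
print(chunk_layout(1, 8))  # {'csum_next': 10, 'data_start': 11, 'csum_prev': 19}
```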
Updating data will first update csum_next (with FUA), then update the data (FUA) and finally update csum_prev (FUA).
But if there is an ordering guarantee, we have a fast path: if a full chunk of data is written, we simply issue a single big write with csum_next, data and csum_prev, all without FUA (except if the incoming request asks for that).
So that's why I'm asking.
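The two update paths described above can be sketched as a small Python planner. This is only a model of the write sequence, not device-mapper code; the names are hypothetical, and the `ordered_within_bio` flag represents the very guarantee being asked about, which may not exist.

```python
from typing import List, Tuple

Write = Tuple[str, bool]  # (what is written, whether the write carries FUA)


def plan_writes(full_chunk: bool, ordered_within_bio: bool,
                request_fua: bool = False) -> List[Write]:
    """Return the writes to issue, in order, for one chunk update."""
    if full_chunk and ordered_within_bio:
        # Fast path: one big write; FUA only if the incoming request asked for it.
        return [("csum_next+data+csum_prev", request_fua)]
    # Slow path: three separate writes, each with FUA, in this order.
    return [("csum_next", True), ("data", True), ("csum_prev", True)]


print(plan_writes(full_chunk=True, ordered_within_bio=True))
print(plan_writes(full_chunk=True, ordered_within_bio=False))
```

Without the ordering guarantee, even a full-chunk write falls back to the three FUA writes.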
_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies