https://bugzilla.kernel.org/show_bug.cgi?id=203947

Darrick J. Wong (djwong+kernel@xxxxxxxxxx) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |djwong+kernel@xxxxxxxxxx

--- Comment #2 from Darrick J. Wong (djwong+kernel@xxxxxxxxxx) ---
Hmm... so we're clearly in a situation where we have ioend A -> ioend B
and we're trying to merge A and B.  A has a setfilesize transaction and
B does not, but the current code assumes that if A has one then B must
have one and that it must cancel B's.  Then we crash trying to cancel
the transaction that B doesn't have.

How do we end up in this situation?  I can't trigger it on my systems,
but I guess this sounds plausible:

1. Dirty pages 0, 1, and 2 of an empty file.

2. Writeback gets scheduled for pages 0 and 2, creating ioends A and C.
   Both ioends describe writes past the on-disk isize so we allocate
   transactions.

3. ioend C completes immediately and sets the on-disk isize to
   (3 * PAGESIZE).

4. Writeback gets scheduled for page 1, creating ioend B.  ioend B
   describes a write within the on-disk isize so we do not allocate a
   setfilesize transaction.

5. ioend A and B complete and are sorted into the per-inode ioend
   completion list.  xfs_ioend_try_merge looks at ioend A, sees that
   ioend A has a setfilesize transaction and that there's an ioend B
   that can be merged with A.

6. xfs_ioend_try_merge tries to call xfs_setfilesize_ioend(ioend B, -1)
   to cancel ioend B's transaction, but as we saw in (4), ioend B has no
   transaction, so we crash.

I wonder how hard it will be to write a regression test for this, since
it requires fairly tight timing?

Coincidentally, Christoph just posted "xfs: allow merging ioends over
append boundaries", which I think fixes this problem.  Zorro, can you
apply it and retry?
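
The sequence in steps 1-6 can be modelled in userspace.  What follows is
only a toy sketch under stated assumptions: every toy_* name is
hypothetical, none of it is the kernel's code, and the "tolerant" merge
is just one way to handle mismatched transactions, not necessarily what
Christoph's patch does.  A 4096-byte page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PAGE 4096LL   /* assumed page size, for illustration only */

/* Toy stand-in for a setfilesize transaction (not struct xfs_trans). */
struct toy_trans {
    long long new_isize;
};

/* Toy stand-in for an ioend: a byte range plus an optional transaction. */
struct toy_ioend {
    long long offset;
    long long size;
    struct toy_trans *append_trans;  /* NULL if write is within on-disk isize */
};

/* Allocate a transaction only if the write ends past the on-disk isize. */
static struct toy_trans *toy_alloc_trans_if_append(long long offset,
                                                   long long size,
                                                   long long ondisk_isize)
{
    struct toy_trans *tp;

    if (offset + size <= ondisk_isize)
        return NULL;
    tp = malloc(sizeof(*tp));
    if (tp == NULL)
        abort();
    tp->new_isize = offset + size;
    return tp;
}

/* Cancel a transaction.  Returns false where the kernel would oops. */
static bool toy_cancel_trans(struct toy_ioend *io)
{
    if (io->append_trans == NULL)
        return false;
    free(io->append_trans);
    io->append_trans = NULL;
    return true;
}

/* Merge that assumes "head has a transaction, so next has one too". */
static bool toy_merge_assuming_both(struct toy_ioend *head,
                                    struct toy_ioend *next)
{
    assert(head->offset + head->size == next->offset);
    head->size += next->size;
    if (head->append_trans)
        return toy_cancel_trans(next);  /* false: nothing to cancel */
    return true;
}

/* Merge that keeps at most one transaction, whichever side had it. */
static void toy_merge_tolerant(struct toy_ioend *head, struct toy_ioend *next)
{
    assert(head->offset + head->size == next->offset);
    head->size += next->size;
    if (head->append_trans == NULL) {
        head->append_trans = next->append_trans;
        next->append_trans = NULL;
    } else if (next->append_trans) {
        toy_cancel_trans(next);
    }
}

/* Replay steps 1-4: leaves A with a transaction and B without one. */
static void toy_build_scenario(struct toy_ioend *a, struct toy_ioend *b)
{
    long long ondisk_isize = 0;          /* step 1: empty file */
    struct toy_ioend c;

    /* Step 2: pages 0 and 2 both lie past the on-disk isize. */
    a->offset = 0;
    a->size = TOY_PAGE;
    a->append_trans = toy_alloc_trans_if_append(a->offset, a->size,
                                                ondisk_isize);
    c.offset = 2 * TOY_PAGE;
    c.size = TOY_PAGE;
    c.append_trans = toy_alloc_trans_if_append(c.offset, c.size,
                                               ondisk_isize);

    /* Step 3: C completes first and moves the on-disk isize to 3 pages. */
    ondisk_isize = c.offset + c.size;
    free(c.append_trans);

    /* Step 4: page 1 is now inside the on-disk isize, so no transaction. */
    b->offset = TOY_PAGE;
    b->size = TOY_PAGE;
    b->append_trans = toy_alloc_trans_if_append(b->offset, b->size,
                                                ondisk_isize);
}
```

Calling toy_build_scenario() and then toy_merge_assuming_both() on the
resulting A and B returns false, which marks the point where step 6
would dereference a transaction that does not exist.  Running
toy_merge_tolerant() on a fresh A and B instead leaves A covering both
pages and still holding its own transaction.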