On Mon, Nov 27, 2023 at 11:13:40PM -0800, Christoph Hellwig wrote: > > Every time I revisit this patch I wonder if DELWRI_Q is misnamed -- does > > it mean "b_list is active"? Or does it merely mean "another thread will > > write this buffer via b_list if nobody gets there first"? I think it's > > the second, since it's easy enough to list_empty(). > > I think it is misnamed and not clearly defined, and we should probably > fix that. Note that least the current list_empty() usage isn't quite > correct either without a lock held by the delwri list owner. We at > least need a list_empty_careful for a racey but still save check. > > Thinking out loud I wonder if we can just kill XBF_DELWRI_Q entirely. > Except for various asserts we mostly use it in reverse polarity, that is > to cancel delwri writeout for stale buffers. How about just skipping > XBF_STALE buffers on the delwri list directly and not bother with > DELWRI_Q? With that we can then support multiple delwri lists that > don't get into each others way using say an allocating xarray instead > of the invase list and avoid this whole mess. > > Let me try to prototype this.. Ok, I spent half the day prototyping this and it passes basic sanity checks: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/xfs-multiple-delwri-lists My conclusions from that is: 1) it works 2) I think it is the right to do 3) it needs a lot more work 4) we can't block the online repair work on it so I guess we'll need to go with the approach in this patch for now, maybe with a better commit log, and I'll look into finishing this work some time in the future.