On 22/05/19 11:30AM, Jan Kara wrote:
> On Thu 19-05-22 11:59:29, Ritesh Harjani wrote:
> > On 22/05/19 11:13AM, Zhang Yi wrote:
> > > On 2022/5/19 1:06, Ritesh Harjani wrote:
> > > > On 22/05/18 10:10PM, Zhang Yi wrote:
> > > >> We have already check the io_error and uptodate flag before submitting
> > > >> the superblock buffer, and re-set the uptodate flag if it has been
> > > >> failed to write out. But it was lockless and could be raced by another
> > > >> ext4_commit_super(), and finally trigger '!uptodate' WARNING when
> > > >> marking buffer dirty. Fix it by submit buffer directly.
> > > >
> > > > I agree that there could be a race with multiple processes trying to call
> > > > ext4_commit_super(). Do you have a easy reproducer for this issue?
> > > >
> > >
> > > Sorry, I don't have a easy reproducer, but we can always reproduce it through
> > > inject delay and add filters into the ext4_commit_super().
> > ...
>
> > > > Also do you think something like below should fix the problem too?
> > > > So if you lock the buffer from checking until marking the buffer dirty, that
> > > > should avoid the race too that you are reporting.
> > > > Thoughts?
> > > >
> > >
> > > Thanks for your suggestion. I've thought about this solution and yes it's simpler
> > > to fix the race, but I think we lock and unlock the sbh several times just for
> > > calling standard buffer write helpers is not so good. Opencode the submit
> > > procedure looks more clear to me.
> >
> > I agree your solution was cleaner since it does not has a lot of lock/unlock.
> > My suggestion came in from looking at the history.
> > This lock was added here [1] and I think it somehow got removed in this patch[2]
> >
> > [1]: https://lore.kernel.org/linux-ext4/1467285150-15977-2-git-send-email-pranjas@xxxxxxxxx/
> > [2]: https://lore.kernel.org/linux-ext4/20201216101844.22917-5-jack@xxxxxxx/
>
> So the reason why I've move unlock_buffer() into ext4_update_super() was
> mostly so that the function does not return with buffer lock (which is an
> odd calling convention) when I was adding another user of it
> (flush_stashed_error_work()).
>
> > Rather then solutions, I had few queries :)
> > 1. What are the implications of not using
> > mark_buffer_dirty()/__sync_dirty_buffer()
>
> Not much. Using submit_bh() directly is fine. Just the duplication of the
> checks is somewhat unpleasant.

Ok.

>
> > 2. In your solution one thing which I was not clear of, was whether we
> > should call clear_buffer_dirty() before calling submit_bh(), in case if
> > somehow(?) the state of the buffer was already marked dirty? Not sure how
> > this can happen, but I see the logic in mark_buffer_dirty() which checks,
> > if the buffer is already marked dirty, it simply returns. Then
> > __sync_dirty_buffer() clears the buffer dirty state.
>
> It could happen e.g. if there was journalled update of the superblock
> before. I guess calling clear_buffer_dirty() before submit_bh() does no
> harm.

Makes sense.

>
> Otherwise I like Yi's solution.

I agree. Thanks for helping with the queries.

-ritesh