On Thu 19-05-22 11:59:29, Ritesh Harjani wrote:
> On 22/05/19 11:13AM, Zhang Yi wrote:
> > On 2022/5/19 1:06, Ritesh Harjani wrote:
> > > On 22/05/18 10:10PM, Zhang Yi wrote:
> > >> We have already check the io_error and uptodate flag before submitting
> > >> the superblock buffer, and re-set the uptodate flag if it has been
> > >> failed to write out. But it was lockless and could be raced by another
> > >> ext4_commit_super(), and finally trigger '!uptodate' WARNING when
> > >> marking buffer dirty. Fix it by submit buffer directly.
> > >
> > > I agree that there could be a race with multiple processes trying to call
> > > ext4_commit_super(). Do you have a easy reproducer for this issue?
> > >
> >
> > Sorry, I don't have a easy reproducer, but we can always reproduce it through
> > inject delay and add filters into the ext4_commit_super().
...
> > > Also do you think something like below should fix the problem too?
> > > So if you lock the buffer from checking until marking the buffer dirty, that
> > > should avoid the race too that you are reporting.
> > > Thoughts?
> > >
> >
> > Thanks for your suggestion. I've thought about this solution and yes it's simpler
> > to fix the race, but I think we lock and unlock the sbh several times just for
> > calling standard buffer write helpers is not so good. Opencode the submit
> > procedure looks more clear to me.
>
> I agree your solution was cleaner since it does not has a lot of lock/unlock.
> My suggestion came in from looking at the history.
> This lock was added here [1] and I think it somehow got removed in this patch[2]
>
> [1]: https://lore.kernel.org/linux-ext4/1467285150-15977-2-git-send-email-pranjas@xxxxxxxxx/
> [2]: https://lore.kernel.org/linux-ext4/20201216101844.22917-5-jack@xxxxxxx/

So the reason why I've moved unlock_buffer() into ext4_update_super() was
mostly so that the function does not return with the buffer lock held (which
is an odd calling convention) when I was adding another user of it
(flush_stashed_error_work()).

> Rather then solutions, I had few queries :)
> 1. What are the implications of not using
> mark_buffer_dirty()/__sync_dirty_buffer()

Not much. Using submit_bh() directly is fine. Just the duplication of the
checks is somewhat unpleasant.

> 2. In your solution one thing which I was not clear of, was whether we
> should call clear_buffer_dirty() before calling submit_bh(), in case if
> somehow(?) the state of the buffer was already marked dirty? Not sure how
> this can happen, but I see the logic in mark_buffer_dirty() which checks,
> if the buffer is already marked dirty, it simply returns. Then
> __sync_dirty_buffer() clears the buffer dirty state.

It could happen e.g. if there was a journalled update of the superblock
before. I guess calling clear_buffer_dirty() before submit_bh() does no
harm. Otherwise I like Yi's solution.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
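
For readers following the thread, the ordering discussed above (lock the buffer, re-check the error and uptodate flags, clear the dirty bit, then submit) can be illustrated with a small userspace model. This is only a sketch and not the actual patch: `struct model_bh` and every `model_*` function are hypothetical stand-ins invented here for the kernel's `struct buffer_head`, `lock_buffer()`, `unlock_buffer()` and `submit_bh()`, and the model assumes the write always succeeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct buffer_head (NOT the real type). */
struct model_bh {
	bool locked;
	bool uptodate;
	bool dirty;
	bool write_io_error;
	int  submitted;		/* number of times the model "submit" ran */
};

static void model_lock(struct model_bh *bh)   { bh->locked = true; }
static void model_unlock(struct model_bh *bh) { bh->locked = false; }

/*
 * Models submit_bh() plus a successful I/O completion: the buffer must be
 * locked on entry, and completion unlocks it.
 */
static void model_submit(struct model_bh *bh)
{
	assert(bh->locked);
	bh->submitted++;
	model_unlock(bh);
}

/*
 * Hypothetical outline of the open-coded submit path: all flag checks
 * happen under the buffer lock, so a concurrent caller cannot observe
 * the buffer between the check and the submit.
 */
void model_commit_super(struct model_bh *bh)
{
	model_lock(bh);

	/* An earlier failed write may have cleared uptodate; the in-memory
	 * copy is still valid, so re-set the flag before writing again. */
	if (bh->write_io_error || !bh->uptodate) {
		bh->write_io_error = false;
		bh->uptodate = true;
	}

	/* The buffer may already be dirty (e.g. after a journalled update
	 * of the superblock); clearing the bit before submit does no harm. */
	bh->dirty = false;

	model_submit(bh);
}
```

Calling `model_commit_super()` on a buffer that starts out dirty, not uptodate and with a recorded write error leaves it uptodate, clean and unlocked after exactly one submit.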