Re: [PATCH v2 2/3] xfs: don't assert perag when free perag

On Tue, Dec 12, 2023 at 09:00:50AM +1100, Dave Chinner wrote:
> On Sat, Dec 09, 2023 at 08:21:06PM +0800, Long Li wrote:
> > When releasing the perag in xfs_free_perag(), the assertion that the
> > perag is in the radix tree holds in most cases. However, there is one
> > corner case where the assertion is not true. During log recovery, the
> > AGs become visible (that is, included in mp->m_sb.sb_agcount) first, and
> > then the perag is initialized. If the initialization of the perag fails,
> > the assertion will be triggered. Worse yet, null pointer dereferencing
> > can occur.
> 
> I'm going to assume that you are talking about xlog_do_recover()
> because the commit message doesn't actually tell us how this
> situation occurs.
> 
> That code re-reads the superblock, then copies it to mp->m_sb,
> then calls xfs_initialize_perag() with the values from mp->m_sb.
> 
> If log recovery replayed a growfs transaction, the mp->m_sb has a
> larger sb_agcount and so then xfs_initialize_perag() is called
> and if that fails we end up back in xfs_mountfs and the error
> stack calls xfs_free_perag().
> 
> Is that correct?

Yes, you are right. When I tried to fix the perag leak issue in patch 3,
I found this problem.

> 
> If so, then the fix is to change how xlog_do_recover() works. It
> needs to initialise the new perags before it updates the in-memory
> superblock. If xfs_initialize_perag() fails, it undoes all the
> changes it has made, so if we haven't updated the in-memory
> superblock when the init of the new perags fails then the error
> unwinding code works exactly as it should right now.
> 
> i.e. the bug is that xlog_do_recover() is leaving the in-memory
> state inconsistent on init failure, and we need to fix that rather
> than remove the assert that is telling us that in-memory state is
> inconsistent....
> 

Yes, I agree. I used to think that removing the assertion would
solve the problem, but that now seems a bit lazy; the problem
should be solved at the source. I haven't yet figured out how to
fix it comprehensively, so I'll fix the perag leak issue first.
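
To make the ordering point concrete, here is a minimal userspace sketch
of the idea discussed above. It is a toy model under stated assumptions,
not kernel code: every toy_* name, MAX_AGS, and the fail_at parameter are
hypothetical and exist only for illustration. The array stands in for the
perag radix tree and agcount stands in for mp->m_sb.sb_agcount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_AGS 16	/* hypothetical cap for the toy model */

struct toy_perag { unsigned agno; };

struct toy_mount {
	unsigned agcount;			/* models mp->m_sb.sb_agcount */
	struct toy_perag *perag[MAX_AGS];	/* models the perag radix tree */
};

/*
 * Set up perags for [old_count, new_count). On failure, undo only what
 * this call added. fail_at simulates an allocation failure at that AG
 * index (-1 means no failure).
 */
static int toy_initialize_perag(struct toy_mount *mp, unsigned old_count,
				unsigned new_count, int fail_at)
{
	assert(new_count <= MAX_AGS);
	for (unsigned agno = old_count; agno < new_count; agno++) {
		struct toy_perag *pag =
			((int)agno == fail_at) ? NULL : malloc(sizeof(*pag));
		if (!pag) {
			/* Unwind the perags added before the failure. */
			while (agno-- > old_count) {
				free(mp->perag[agno]);
				mp->perag[agno] = NULL;
			}
			return -1;
		}
		pag->agno = agno;
		mp->perag[agno] = pag;
	}
	return 0;
}

/* Suggested order: initialise new perags, then publish the new count. */
static int toy_grow_fixed(struct toy_mount *mp, unsigned new_count, int fail_at)
{
	int error = toy_initialize_perag(mp, mp->agcount, new_count, fail_at);

	if (error)
		return error;	/* agcount untouched, state still consistent */
	mp->agcount = new_count;
	return 0;
}

/* Problematic order: publish the new count, then initialise perags. */
static int toy_grow_buggy(struct toy_mount *mp, unsigned new_count, int fail_at)
{
	unsigned old_count = mp->agcount;

	mp->agcount = new_count;
	return toy_initialize_perag(mp, old_count, new_count, fail_at);
}

/* The invariant the assert relies on: every AG below agcount has a perag. */
static bool toy_perags_consistent(const struct toy_mount *mp)
{
	for (unsigned agno = 0; agno < mp->agcount; agno++)
		if (!mp->perag[agno])
			return false;
	return true;
}

/* Release every perag that is still present. */
static void toy_free_perag(struct toy_mount *mp)
{
	for (unsigned agno = 0; agno < MAX_AGS; agno++) {
		free(mp->perag[agno]);
		mp->perag[agno] = NULL;
	}
	mp->agcount = 0;
}
```

In this model, a failed toy_grow_fixed() leaves agcount at its old value,
so the invariant still holds when the error path frees the perags. A
failed toy_grow_buggy() leaves agcount pointing past the perags that
exist, which is the inconsistent state the assert reports.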

Thanks,
Long Li




