On Sat, Dec 09, 2023 at 08:21:06PM +0800, Long Li wrote:
> When releasing the perag in xfs_free_perag(), the assertion that the
> perag in radix tree is correct in most cases. However, there is one
> corner case where the assertion is not true. During log recovery, the
> AGs become visible(that is included in mp->m_sb.sb_agcount) first, and
> then the perag is initialized. If the initialization of the perag fails,
> the assertion will be triggered. Worse yet, null pointer dereferencing
> can occur.

I'm going to assume that you are talking about xlog_do_recover()
because the commit message doesn't actually tell us how this
situation occurs. That code re-reads the superblock, then copies it
to mp->m_sb, then calls xfs_initialize_perag() with the values from
mp->m_sb.

If log recovery replayed a growfs transaction, then mp->m_sb has a
larger sb_agcount, so xfs_initialize_perag() is called, and if that
fails we end up back in xfs_mountfs() and the error stack calls
xfs_free_perag().

Is that correct?

If so, then the fix is to change how xlog_do_recover() works. It
needs to initialise the new perags before it updates the in-memory
superblock. If xfs_initialize_perag() fails, it undoes all the
changes it has made, so if we haven't updated the in-memory
superblock when the init of the new perags fails then the error
unwinding code works exactly as it should right now.

i.e. the bug is that xlog_do_recover() is leaving the in-memory
state inconsistent on init failure, and we need to fix that rather
than remove the assert that is telling us that in-memory state is
inconsistent....

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
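
For illustration only, the ordering argument above can be modelled
in a few lines of userspace C. This is a toy sketch, not kernel code
and not the actual patch: struct toy_mount, the toy_*() functions and
the fail_at fault-injection parameter are all hypothetical stand-ins,
and a flat bool array stands in for the perag radix tree. It only
shows why publishing sb_agcount before the perags exist leaves the
teardown check failing, and why initialising first does not.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_AGS 16

/* Toy stand-in for the mount; NOT the real struct xfs_mount. */
struct toy_mount {
	unsigned int sb_agcount;          /* in-memory superblock AG count */
	bool perag_present[TOY_MAX_AGS];  /* stands in for the perag radix tree */
};

/*
 * Toy stand-in for xfs_initialize_perag(): set up AGs [old, new).
 * On failure it undoes everything it did. fail_at is a hypothetical
 * fault-injection knob; a negative value means "never fail".
 */
static int toy_initialize_perag(struct toy_mount *mp, unsigned int old_agcount,
				unsigned int new_agcount, int fail_at)
{
	unsigned int agno;

	if (new_agcount > TOY_MAX_AGS)
		return -1;
	for (agno = old_agcount; agno < new_agcount; agno++) {
		if (fail_at >= 0 && agno == (unsigned int)fail_at)
			goto unwind;
		mp->perag_present[agno] = true;
	}
	return 0;
unwind:
	/* Remove only the perags added by this call. */
	while (agno-- > old_agcount)
		mp->perag_present[agno] = false;
	return -1;
}

/*
 * Toy stand-in for xfs_free_perag(): every AG below sb_agcount is
 * expected to have a perag. Returns false where an assert would fire.
 */
static bool toy_free_perag(struct toy_mount *mp)
{
	bool consistent = true;
	unsigned int agno;

	for (agno = 0; agno < mp->sb_agcount; agno++) {
		if (!mp->perag_present[agno])
			consistent = false;
		mp->perag_present[agno] = false;
	}
	return consistent;
}

/* Problem ordering: publish the new sb_agcount, then init the perags. */
static int toy_recover_sb_first(struct toy_mount *mp, unsigned int new_agcount,
				int fail_at)
{
	unsigned int old_agcount = mp->sb_agcount;

	mp->sb_agcount = new_agcount;
	return toy_initialize_perag(mp, old_agcount, new_agcount, fail_at);
}

/* Suggested ordering: init the perags, publish sb_agcount on success. */
static int toy_recover_perag_first(struct toy_mount *mp,
				   unsigned int new_agcount, int fail_at)
{
	int error = toy_initialize_perag(mp, mp->sb_agcount, new_agcount,
					 fail_at);

	if (error)
		return error;
	mp->sb_agcount = new_agcount;
	return 0;
}

/* Grow 4 -> 8 AGs with an injected failure at AG 6, both orderings. */
static void toy_compare_orderings(void)
{
	struct toy_mount a = { .sb_agcount = 4,
			       .perag_present = { true, true, true, true } };
	struct toy_mount b = a;

	assert(toy_recover_sb_first(&a, 8, 6) != 0);
	assert(!toy_free_perag(&a));   /* AGs 4-7 visible but have no perag */

	assert(toy_recover_perag_first(&b, 8, 6) != 0);
	assert(toy_free_perag(&b));    /* state unchanged, teardown is clean */
}
```

Calling toy_compare_orderings() from a main() exercises both paths.
In the toy model the second ordering needs no change to the teardown
check at all, which is the point being made about keeping the assert.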