Hi Dave,

Thanks for your response. I am not freezing the filesystem before
taking the snapshot.

However, let's assume that somebody resized the XFS filesystem, and the
resize completed and returned to user-space. At this moment the primary
on-disk superblock has not yet been updated with the new agcount. If the
power fails at this same moment, then after the power comes back and the
machine boots, mounting the XFS would hit the same problem, I believe,
because the primary on-disk superblock still has the old agcount. So the
in-memory pag structures will not be created for the new AGs during
mount, but log replay might try to use them.

Taking a block-level snapshot is exactly like a power failure from the
XFS perspective, and XFS should, in principle, be able to recover from
that. The snapshot comes up as a new block device whose content is
identical to what the original block device had at the moment the
snapshot was taken (like a boot after a power failure).

I will try to reproduce the problem without the snapshot, by crashing
the machine at the problematic moment, when the primary on-disk
superblock still has the old value.

Thanks,
Alex.

On Mon, Feb 22, 2016 at 11:20 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Mon, Feb 22, 2016 at 09:08:06PM +0200, Alex Lyakas wrote:
>> Greetings XFS developers,
>>
>> I am seeing the following issue with XFS on kernel 3.18.19.
>>
>> When resizing, XFS adds new AGs and eventually updates the primary
>> superblock with the new “sb_agcount” value. However, it happens few
>> seconds after the resize operation completes back to user-space. As
>> a result, if a block-level snapshot is taken off the underlying
>> block device, while “sb_agcount” still has the old value, then
>> subsequent XFS mount crashes with stack like[1].
>
> The primary superblock change is logged, so it doesn't need to be
> written back immediately. That means it is in the journal...
>
>> Some debugging shows that _xfs_buf_find is called with agno that has
>> been added during the resize, but appropriate "pag" has not been
>> created for this agno during mount.
>
> The new per-ag structures are created during growfs, after the
> growfs transaction has committed. If you are mounting a snapshot
> that has the wrong agcount in it, then lots of things will go wrong
> if there is metadata that already uses the expanded space.
>
>> I have found the patch by Christoph Hellwig:
>> http://oss.sgi.com/archives/xfs/2015-01/msg00391.html
>> which sets the resize transaction to be synchronous, and applied it,
>> but it still doesn’t help.
>>
>> Right after the resize completes, I am issuing:
>> xfs_db -r -c "sb 0" -c "p" <device>
>> and for a few seconds still get the old value of “sb_agcount”.
>>
>> Can anybody advise what am I missing? What needs to be done so that
>> the primary superblock will get the new value of “sb_agcount”
>> promptly?
>
> Are you freezing the filesystem before taking a block level
> snapshot?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs