Folks,

I just had a stack overflow in the delayed write buffer error handling
with a shut down filesystem:

.....
[ 20.712744] [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[ 20.712744] [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[ 20.712744] [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[ 20.712744] [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[ 20.712744] [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[ 20.712744] [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[ 20.712744] [<ffffffff814b9d7c>] xfs_buf_iodone_callbacks+0x11c/0x290
[ 20.712744] [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[ 20.712744] [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[ 20.712744] [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[ 20.712744] [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[ 20.712744] [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[ 20.712744] [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[ 20.712744] [<ffffffff814b9d7c>] xfs_buf_iodone_callbacks+0x11c/0x290
[ 20.712744] [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[ 20.712744] [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[ 20.712744] [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[ 20.712744] [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[ 20.712744] [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[ 20.712744] [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[ 20.712744] [<ffffffff814b9d7c>] xfs_buf_iodone_callbacks+0x11c/0x290
[ 20.712744] [<ffffffff81448023>] xfs_buf_iodone_work+0x23/0x50
[ 20.712744] [<ffffffff814481a0>] xfs_buf_ioend+0x70/0x180
[ 20.712744] [<ffffffff814484c5>] _xfs_buf_ioend+0x25/0x30
[ 20.712744] [<ffffffff81448788>] __xfs_buf_iorequest+0x98/0x130
[ 20.712744] [<ffffffff81448836>] xfs_buf_iorequest+0x16/0x20
[ 20.712744] [<ffffffff81448945>] xfs_bdstrat_cb+0x65/0x110
[ 20.712744] [<ffffffff81448c39>] __xfs_buf_delwri_submit+0x249/0x280
[ 20.712744] [<ffffffff81449920>] xfs_buf_delwri_submit_nowait+0x20/0x30
[ 20.712744] [<ffffffff814bc43e>] xfsaild+0x21e/0x750
[ 20.712744] [<ffffffff810a0472>] kthread+0xa2/0xb0
[ 20.712744] [<ffffffff81b83c64>] kernel_thread_helper+0x4/0x10

Basically, the commit:

43ff212 xfs: on-stack delayed write buffer lists

took away the delay in resubmitting metadata buffers that have had a
write error. As a result, the xfsbdstrat() resubmission immediately
errors out on the shutdown flag and calls the I/O completion for the
buffer. That runs xfs_buf_iodone_callbacks(), which calls
xfs_bdstrat_cb(), which again errors out on the shutdown flag and calls
I/O completion, and around it goes in a spiral of death. Every lap is a
nested call on the same stack, so the stack eventually overflows.

I did flag the change to an immediate xfsbdstrat() call as a problem in
review, and mentioned a possible solution to the problem, but it looks
like it fell through the cracks:

http://oss.sgi.com/archives/xfs/2012-04/msg00760.html

"This will just resubmit the IO immediately after it is failed, while
previously it will only be pushed again after it ages out (15s
later). Perhaps it can just be left to be pushed by the aild next
time it passes over it?"

That would definitely prevent the Spiral of Stack Doom that I've just
seen....

I don't have time to come up with a fix for this right now, but it
needs to be fixed before 3.5 releases. I'm going to be AFK next week,
so I'd appreciate it if someone could look at fixing this in the
meantime.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
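
For illustration only, the difference between resubmitting from the
completion handler and leaving the buffer for the next pusher pass can be
sketched with a toy userspace model. This is Python, not kernel code. The
names Model, submit, complete and run are hypothetical and correspond only
loosely to the functions in the trace, and max_attempts is an artificial
cap so the model terminates (the real loop has no such cap, which is why
the stack overflows).

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """Toy model of writing a buffer to a shut down filesystem."""
    immediate: bool            # True: resubmit from the completion handler
    max_attempts: int = 50     # artificial cap; the real loop has none
    attempts: int = 0          # total submissions made
    depth: int = 0             # current nesting of submit() calls
    max_depth: int = 0         # deepest nesting seen (stands in for stack use)
    requeued: list = field(default_factory=list)

    def submit(self, buf):
        self.attempts += 1
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)
        # The filesystem is shut down, so the write fails synchronously
        # and completion runs on the submitter's stack.
        self.complete(buf)
        self.depth -= 1

    def complete(self, buf):
        if self.attempts >= self.max_attempts:
            return                      # stop the model; not a real exit path
        if self.immediate:
            self.submit(buf)            # nested resubmission: stack grows
        else:
            self.requeued.append(buf)   # leave it for the next pusher pass


def run(immediate, max_attempts=50):
    """Drive pusher passes until the attempt cap is hit.

    Returns (max_depth, attempts).
    """
    m = Model(immediate=immediate, max_attempts=max_attempts)
    pending = ["buf0"]
    while pending and m.attempts < m.max_attempts:
        m.requeued = []
        for buf in pending:
            m.submit(buf)
        pending = m.requeued
    return m.max_depth, m.attempts


if __name__ == "__main__":
    print("immediate resubmit:", run(True))
    print("deferred to pusher:", run(False))
```

In this model both variants make the same number of attempts, but with
immediate resubmission the nesting depth grows with every attempt, while
deferring to the next pass keeps it at one frame per attempt.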