On 03/31/14 16:40, Dave Chinner wrote:
On Mon, Mar 31, 2014 at 11:42:46AM -0500, Mark Tinguely wrote:
On 03/30/14 19:10, Dave Chinner wrote:
On Fri, Mar 28, 2014 at 12:33:34PM -0500, Mark Tinguely wrote:
Fix the directory "bad hash ordering" bug introduced in
commit f5ea1100.
...
---
A C program that generates this problem can be found at:
http://oss.sgi.com/archives/xfs/2014-03/msg00373.html
An xfstest for this bug is coming from Hannes Frederic Sowa.
Can you convert this program to an xfstest yourself so that I can
commit the regression test at the same time I commit an updated
fix?
We narrowed the iterations down to make it a quick test.
I have every confidence that Hannes can generate the test in a timely
manner and I will help in any way possible.
Well, it's been over a week now and you're asking me to trust that
someone I don't know, and who has never submitted an xfstest before,
will do something in a timely manner so we can test a critical bug fix
during a merge window. I'm willing to be pleasantly surprised, but
history tells me that people who report bugs rarely follow up with
xfstest cases, and it's usually the developer who fixes the bug who
generates the xfstests patch.
So if the xfstests patch doesn't arrive in the next few hours, can
you please do that for us so I can get this sorted out for the merge
window?
Cheers,
Dave.
Dave,
I think we need to take a step back and clear a little confusion here.
There are 2 different directory bugs.
1) Freeing of an already free extent. It presents with the error:
XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 16XX of file
fs/xfs/xfs_alloc.c.
Could be a right or a left edge (or both) that is free.
Morgan Meyers <Morgan.Mears@xxxxxxxxxx> sent the latest occurrence on
March 12, but others have been seeing it in the community code in the
last few months. SGI has been seeing it lately with big customers, and
it has occurred off and on for 7-8 years according to our bug
database.
It is a nasty bug that can cause corruption. As I mentioned last
week in the analysis of Morgan's metadata dump, XFS can allocate the
same buffer multiple times. In his metadata dump there are a directory
block and inode clusters that are also allocated as user blocks. These
duplicate allocated blocks are land mines waiting to go off, either
when written to by one owner or when both allocations are
removed, which causes the XFS_WANT_CORRUPTED_GOTO forced shutdown.
2) Hannes Frederic Sowa found a different directory bug on Thursday,
March 27. He included a replicator. I bisected the source of this
bug on Thursday, walked through the bisected patch on Friday, and
posted the patch. The idea to make an xfstest from the replicator was
also raised on March 28.
This bug has only been known for 3 business days. I already promised
that an xfstest will be made. If you need to verify the problem and
the patch, there already is a replicator.
--Mark.
_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs