On Mon, Mar 31, 2014 at 09:22:19PM -0500, Mark Tinguely wrote:
> >Well, it's been over a week now and you're asking me to trust that
> >someone I don't know and who has never submitted an xfstests before
> >to do something in a timely manner so we can test a critical bug fix
> >during a merge window. I'm willing to be pleasantly surprised, but
> >history tells me that people that report bugs rarely follow up with
> >xfstest cases and it's usually the developer that fixes the bug that
> >generates the xfstests patch.
> >
> >So if the xfstests patch doesn't arrive in the next few hours, can
> >you please do that for us so I can get this sorted out for the merge
> >window?
> >
> >Cheers,
> >
> >Dave.
>
> Dave,
>
> I think we need to take a step back and clear a little confusion here.
> There are 2 different directory bugs.
>
> 1) Freeing of an already free extent. It presents with the error:
> XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 16XX of file
> fs/xfs/xfs_alloc.c.
> Could be a right or a left edge (or both) that is free.
>
> Morgan Meyers <Morgan.Mears@xxxxxxxxxx> sent the latest occurrence on
> March 12, but others have been seeing it in the community code in the
> last few mounts. SGI has been seeing it lately with big customers and
> it has occurred off and on for 7-8 years according to our bug
> database.

I fail to see what this has to do with someone providing an xfstests
case for the directory hash regression that was under discussion.

Regardless, I'll take issue with your sweeping generalisation: not
every XFS_WANT_CORRUPTED_GOTO error has the same cause. Indeed, for
most of the ones we've seen in the past 7-8 years, we've either found
some kind of problem with hardware or fixed other bugs that have made
it go away.

The above issue that was reported is - so far - one of a kind. I
haven't seen any other reports that are even vaguely similar.
If SGI has more customers hitting this problem, then it would be
really nice if SGI could provide that information to the community
rather than complain that they've been seeing it for 8 years. All
that tells us in the community is that you aren't fixing bugs your
customers are hitting and you aren't passing them on to people who
might be able to help...

IOWs, if a vendor doesn't have the expertise to find the underlying
problem and they need help tracking down such problems, then they
should report the bugs to the list like end users do.

> 2) Hannes Frederic Sowa found a different directory bug on Thursday,
> March 27. He included a replicator. I bisected the source of this
> bug on Thursday. Walked the bisected patch on Friday and posted the
> patch. The idea to make a xfstest from the replicator was also made
> on March 28.
>
> This bug has been only known for 3 business days. I already promised
> that a xfstest will be made. If you need to verify the problem and
> the patch, there already is a replicator.

The xfstest is *not for me* - it's for every distro and vendor out
there that ships XFS in their product to realise that there's a
serious bug they need fixing, and for them to be able to confirm
that they've fixed it.

I don't ask people to do stuff for my benefit - I'm perfectly
capable of doing random special stuff for myself - but I will ask
for things that are needed for the greater community. That's why I
asked you to rewrite the commit message to explain what the cause
and impact of the problem being fixed was, and why I'm asking for
the regression test to be provided quickly.

Both of these things greatly benefit downstream users of XFS and
xfstests, so upstream processes need to reflect this. Fixing the bug
in the upstream tree is only half the job we need to do...

It's a moot discussion now that the xfstest case has been posted....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx