Re: XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 990 of file fs/xfs/xfs_ialloc.c

Brian Foster <bfoster@xxxxxxxxxx> · Wed, 18 Feb 2015 10:40:27 -0500

On Wed, Feb 18, 2015 at 12:19:03PM -0300, Pablo Silva wrote:
> The XFS filesystem is only accessed by the KVM guest; the objective is to
> store the backups of the Amanda instance running on the KVM host.
> 

Ok, so you've got an lv on the CentOS 6 host that is mapped through to
the CentOS 7 guest. It's not really clear to me whether you've
formatted that again with lvm inside the guest, or the raw drive is
mapped through, but anyways...

If the fs is only accessed by the guest (CentOS 7), why does the error
report occur on the CentOS 6 kernel?

Feb 12 19:22:15 vtl kernel: Pid: 3502, comm: touch Not tainted 2.6.32-431.17.1.el6.x86_64

Brian

> Now, to answer your questions:
> 
> * Have there been any other storage errors reported in the logs?
> 
> ans: No, there is nothing about XFS in /var/log/messages or dmesg. Can
> you suggest another log file where I could look for information?
> 
> * Is the problem reproducible or was it a one off occurrence?
> 
> ans: It is only the first occurrence, and the server sees little use,
> because we are still working on completing the Amanda backups; it isn't in
> production yet.
> 
> -Pablo
> 
> On Wed, Feb 18, 2015 at 11:58 AM, Brian Foster <bfoster@xxxxxxxxxx> wrote:
> 
> > On Wed, Feb 18, 2015 at 11:13:23AM -0300, Pablo Silva wrote:
> > > Hi Brian!
> > >
> > >  Thanks for your time! I've just run xfs_repair -n; you can see the
> > > result at http://pastebin.centos.org/16106/ and, as you can see, there
> > > were problems. But why do these problems occur? We have an XFS partition
> > > used by one KVM guest running Amanda on CentOS 7. Perhaps it was a bad
> > > configuration inside KVM, or perhaps a bug? The host server is running
> > > CentOS 6:
> > >
> >
> > Can you elaborate? You have an isolated XFS filesystem that is accessed
> > by the host system, the kvm guest, or both? How is the fs used?
> >
> > Also, can you answer my other questions related to how this occurred?
> > Are there any other errors in the logs that precede the one below?
> >
> > I assume the fs shut down at the point of the error below. Has it been
> > mounted since? If not, that might be worth a try to replay the log. If
> > the log is dirty and replayed, that should be indicated by the output in
> > the log at mount time. You'll also want to re-run xfs_repair -n in that
> > case.
> >
> > Brian
> >
> > > > [root@vtl ~]# uname -a
> > > > Linux vtl.areaprod.b2b 2.6.32-504.8.1.el6.x86_64 #1 SMP Wed Jan 28 21:11:36 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> > > >
> > > >
> > > > But the KVM guest is running CentOS 7:
> > > >
> > > > Linux amanda 3.10.0-123.8.1.el7.x86_64 #1 SMP Mon Sep 22 19:06:58 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> > >
> > > The KVM config is this --> http://pastebin.centos.org/16111/
> > >
> > > And /etc/fstab is this---> http://pastebin.centos.org/16116/
> > >
> > >
> > > Thanks in advance, for any hint.
> > >
> > > -Pablo
> > >
> > >
> > > > On Mon, Feb 16, 2015 at 11:10 AM, Brian Foster <bfoster@xxxxxxxxxx> wrote:
> > >
> > > > On Fri, Feb 13, 2015 at 03:44:57PM -0300, Pablo Silva wrote:
> > > > > Hi !
> > > > >
> > > > >     We have a server with centos 6.6, kernel version:
> > > > > 2.6.32-431.17.1.el6.x86_64, where we got the following message:
> > > > >
> > > > > > Feb 12 19:22:15 vtl kernel:
> > > > > > Feb 12 19:22:15 vtl kernel: Pid: 3502, comm: touch Not tainted 2.6.32-431.17.1.el6.x86_64 #1
> > > > > > Feb 12 19:22:15 vtl kernel: Call Trace:
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffffa041ae5f>] ? xfs_error_report+0x3f/0x50 [xfs]
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffffa0422980>] ? xfs_ialloc+0x60/0x6e0 [xfs]
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffffa041ec2e>] ? xfs_dialloc+0x43e/0x850 [xfs]
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffffa0422980>] ? xfs_ialloc+0x60/0x6e0 [xfs]
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffffa044007a>] ? kmem_zone_zalloc+0x3a/0x50 [xfs]
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffffa043b814>] ? xfs_dir_ialloc+0x74/0x2b0 [xfs]
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffffa043d900>] ? xfs_create+0x440/0x640 [xfs]
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffffa044aa5d>] ? xfs_vn_mknod+0xad/0x1c0 [xfs]
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffffa044aba0>] ? xfs_vn_create+0x10/0x20 [xfs]
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffff81198086>] ? vfs_create+0xe6/0x110
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffff8119bb9e>] ? do_filp_open+0xa8e/0xd20
> > > > > > Feb 12 19:22:15 vtl kernel: [<ffffffff811a7ea2>] ? alloc_fd+0x92/0x160
> > > > > > Feb 12 19:22:15 vtl kernel: XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 990 of file fs/xfs/xfs_ialloc.c. Caller 0xffffffffa0422980
> > > > >
> > > >
> > > >         /*
> > > >          * None left in the last group, search the whole AG
> > > >          */
> > > >         error = xfs_inobt_lookup(cur, 0, XFS_LOOKUP_GE, &i);
> > > >         if (error)
> > > >                 goto error0;
> > > >         XFS_WANT_CORRUPTED_GOTO(i == 1, error0);
> > > >
> > > >         for (;;) {
> > > >                 error = xfs_inobt_get_rec(cur, &rec, &i);
> > > >                 if (error)
> > > >                         goto error0;
> > > >                 XFS_WANT_CORRUPTED_GOTO(i == 1, error0);
> > > >                 if (rec.ir_freecount > 0)
> > > >                         break;
> > > >                 error = xfs_btree_increment(cur, 0, &i);
> > > >                 if (error)
> > > >                         goto error0;
> > > > --->            XFS_WANT_CORRUPTED_GOTO(i == 1, error0);
> > > >         }
> > > >
> > > > That corresponds to the check above. This code is part of the inode
> > > > allocator where we expect an AG to have free inodes and we're doing a
> > > > brute force search for a record. Apparently we go off the AG or some
> > > > other problem occurs before we find a free inode record.
> > > >
> > > > Does 'xfs_repair -n' report any problems with this fs? Have there been
> > > > any other storage errors reported in the logs? Is the problem
> > > > reproducible or was it a one off occurrence?
> > > >
> > > > Brian
> > > >
> > > > > > I can't find more information about this. Perhaps it is a bug or
> > > > > > something else? Any hint on where to look would be welcome.
> > > > >
> > > > > Thanks in advance!
> > > > >
> > > > > -Pablo
> > > >
> > > > > _______________________________________________
> > > > > xfs mailing list
> > > > > xfs@xxxxxxxxxxx
> > > > > http://oss.sgi.com/mailman/listinfo/xfs
> > > >
> > > >
> >
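
For readers following the quoted xfs_ialloc.c excerpt, here is a rough
userspace sketch of why the marked check can fire. It is not kernel code:
struct toy_rec and toy_scan_for_free() are hypothetical names, and a flat
array stands in for the inode btree. The assumption it illustrates is that
the scan is only entered when the AG is believed to contain free inodes, so
running past the last record without finding one means the counters and the
records disagree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode btree record; not the kernel's struct. */
struct toy_rec {
	unsigned startino;	/* first inode number covered by the record */
	int freecount;		/* free inodes remaining in this record */
};

/*
 * Toy model of the "search the whole AG" loop. Returns the index of the
 * first record with free inodes, or -1 when the scan runs out of records,
 * which is the situation the kernel reports as corruption.
 */
static int toy_scan_for_free(const struct toy_rec *recs, size_t nrecs)
{
	size_t cur = 0;
	int i;

	assert(recs != NULL || nrecs == 0);

	/* Models the stat returned by the initial lookup: 1 if a record exists. */
	i = (nrecs > 0);
	if (i != 1)
		return -1;

	for (;;) {
		if (recs[cur].freecount > 0)
			return (int)cur;
		/* Models the cursor increment: stat is 0 when there is no next record. */
		cur++;
		i = (cur < nrecs);
		if (i != 1)
			return -1;	/* the marked check would fire here */
	}
}
```

With records whose free counts are 0, 0 and 3, the sketch returns index 2.
With free counts of 0, 0 and 0 it returns -1, which corresponds to the path
where the real code jumps to error0 and logs the internal error.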
