Re: [BUG] XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 272

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,
Test environment consist two machines target and initiator.
First machine works as target with QLogic Corp. ISP2432-based 4Gb
Fibre Channel device. Storage is placed on two KINGSTON SNV425S SSD
working as RAID0 array. RAID is managed by LSI MegaRAID SAS 1068
controller.
Second machine works as initiator with the same QLogic card.

After few days of running test script I got following call trace and
XFS stopped working.

Can you narrow this down from "takes several days" to the simplest
possible reproducer? It happened due to IO errors during mount, so
maybe you can did that part out of your script and give us a test
case that reproduces on the first mount?


I 'll try. This errors occurs only on heavy load.
Is there any possibility to simulate I/O errors on XFS filesystem?
You can use something like a dm-flakey or md-faulty block devices perhaps.
Thank you very much. Md-faulty helps to reproduce this problem. I can repeat this problem after a few mounts.
Simple test script source for reproducing this problem:

#!/bin/bash

mdadm --create -l faulty /dev/md0 -n1 /dev/sda
parted /dev/md0 --script mklabel msdos
parted /dev/md0 --script mkpart primary 1 100

mkfs -t xfs -f -l lazy-count=0 /dev/md0p1

mdadm -G /dev/md0 -l faulty --layout=rt40
mkdir /mnt/test

while true; do

mount -t xfs -o defaults,usrquota,grpquota,nouuid,noatime,nodiratime /dev/md0p1 /mnt/test
sleep 1
umount /dev/md0p1

done

And a call trace:

[  994.403980] XFS (md0p1): Mounting Filesystem
[  994.439392] XFS (md0p1): Ending clean mount
[  994.439805] XFS (md0p1): Quotacheck needed: Please wait.
[  994.453523] XFS (md0p1): Quotacheck: Done.
[  995.494715] XFS (md0p1): Mounting Filesystem
[ 995.495310] XFS (md0p1): metadata I/O error: block 0x1905f ("xlog_bread_noalign") error 5 buf count 512
[  995.557848] XFS (md0p1): empty log check failed
[  995.557851] XFS (md0p1): log mount/recovery failed: error 5
[  995.558284] XFS (md0p1): log mount failed
[  996.585465] XFS (md0p1): Mounting Filesystem
[  996.604007] XFS (md0p1): Ending clean mount
[  997.633585] XFS (md0p1): last sector read failed
[  998.642557] XFS (md0p1): Mounting Filesystem
[ 998.660927] XFS (md0p1): metadata I/O error: block 0x60 ("xfs_trans_read_buf") error 5 buf count 4096
[  998.723018] XFS (md0p1): Ending clean mount
[  999.751782] XFS (md0p1): Mounting Filesystem
[ 999.779665] XFS (md0p1): metadata I/O error: block 0x60 ("xfs_trans_read_buf") error 5 buf count 4096
[  999.842094] XFS (md0p1): Ending clean mount
[ 1000.867508] XFS (md0p1): Mounting Filesystem
[ 1000.895354] XFS (md0p1): metadata I/O error: block 0x40 ("xfs_trans_read_buf") error 5 buf count 8192 [ 1000.958168] XFS (md0p1): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
[ 1000.958172] XFS (md0p1): failed to read root inode
[ 1000.958240] XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 272
[ 1001.022557] ------------[ cut here ]------------
[ 1001.054458] kernel BUG at fs/xfs/xfs_message.c:101!
[ 1001.054461] invalid opcode: 0000 [#1] SMP
[ 1001.054464] CPU 1
[ 1001.054465] Modules linked in: iscsi_scst(O) scst_vdisk(O) scst(O) libcrc32c ext2 drbd(O) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bonding sg e1000e(O) usbserial uhci_hcd ohci_hcd ehci_hcd aufs [last unloaded: ohci_hcd]
[ 1001.054477]
[ 1001.054479] Pid: 18813, comm: mount Tainted: G O 3.2.42-oe64-00000-gd572330 #1 Supermicro C2SBC-Q/C2SBC-Q [ 1001.054482] RIP: 0010:[<ffffffff812ff8ed>] [<ffffffff812ff8ed>] assfail+0x1d/0x30
[ 1001.054488] RSP: 0000:ffff880112849c68  EFLAGS: 00010296
[ 1001.054490] RAX: 0000000000000071 RBX: ffff880108d77840 RCX: ffff88013b006c00 [ 1001.054491] RDX: 00000000000000c2 RSI: 0000000000000000 RDI: ffffffff81ded518 [ 1001.054493] RBP: ffff880112849c68 R08: ffff88013b006c00 R09: ffff88013fd120c0 [ 1001.054494] R10: 0000000000000000 R11: 0000000000000002 R12: ffff88013988d000 [ 1001.054496] R13: 0000000000000000 R14: ffff88013988d1e8 R15: ffff88013988d1f8 [ 1001.054498] FS: 0000000000000000(0000) GS:ffff88013fd00000(0063) knlGS:00000000f75386c0
[ 1001.054499] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[ 1001.054501] CR2: 00000000f75f5540 CR3: 0000000135d33000 CR4: 00000000000406e0 [ 1001.054502] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1001.054504] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1001.054506] Process mount (pid: 18813, threadinfo ffff880112848000, task ffff8801355cb410)
[ 1001.054507] Stack:
[ 1001.054508] ffff880112849ca8 ffffffff8134eb7a 0000000000000005 ffff88013988d000 [ 1001.054510] ffff88013988d018 0000000000000005 ffff88013988d000 0000000000000000 [ 1001.054513] ffff880112849cf8 ffffffff8135054d 0000000000000000 0000000000000000
[ 1001.054515] Call Trace:
[ 1001.054519]  [<ffffffff8134eb7a>] xfs_free_perag+0x8a/0xc0
[ 1001.054521]  [<ffffffff8135054d>] xfs_mountfs+0x31d/0x700
[ 1001.054524]  [<ffffffff81301fab>] xfs_fs_fill_super+0x1cb/0x270
[ 1001.054527]  [<ffffffff811476da>] mount_bdev+0x19a/0x1d0
[ 1001.054529]  [<ffffffff81301de0>] ? xfs_fs_write_inode+0x180/0x180
[ 1001.054533]  [<ffffffff8138b006>] ? selinux_sb_copy_data+0x156/0x1d0
[ 1001.054536]  [<ffffffff81300200>] xfs_fs_mount+0x10/0x20
[ 1001.054538]  [<ffffffff81146df1>] mount_fs+0x41/0x180
[ 1001.054541]  [<ffffffff8115f7ae>] vfs_kern_mount+0x5e/0xc0
[ 1001.054543]  [<ffffffff8116075e>] do_kern_mount+0x4e/0x100
[ 1001.054545]  [<ffffffff81161f26>] do_mount+0x516/0x740
[ 1001.054548]  [<ffffffff811064e9>] ? __get_free_pages+0x9/0x40
[ 1001.054551]  [<ffffffff81187cc2>] compat_sys_mount+0xa2/0x220
[ 1001.054554]  [<ffffffff8178be43>] ia32_do_call+0x13/0x13
[ 1001.054555] Code: 66 66 90 66 66 66 90 66 66 66 90 66 66 90 55 41 89 d0 48 89 f1 48 89 fa 48 c7 c6 b8 bf 9b 81 31 ff 48 89 e5 31 c0 e8 53 ff ff ff <0f> 0b eb fe 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 31
[ 1001.054571] RIP  [<ffffffff812ff8ed>] assfail+0x1d/0x30
[ 1001.054573]  RSP <ffff880112849c68>
[ 1001.054575] ---[ end trace 9fa869a5d6931100 ]---

I also reproduced this call trace on kernel 3.4.39

[  486.338190] md: bind<sda>
[  486.342375] bio: create slab <bio-1> at 1
[  486.342387] md0: detected capacity change from 0 to 500107771904
[  486.354250]  md0: p1
[  486.709863] XFS (md0p1): Mounting Filesystem
[  486.738741] XFS (md0p1): Ending clean mount
[  486.739136] XFS (md0p1): Quotacheck needed: Please wait.
[  486.767727] XFS (md0p1): Quotacheck: Done.
[  487.808918] XFS (md0p1): Mounting Filesystem
[ 487.809500] XFS (md0p1): metadata I/O error: block 0x1905f ("xlog_bread_noalign") error 5 buf count 512
[  487.871589] XFS (md0p1): empty log check failed
[  487.871591] XFS (md0p1): log mount/recovery failed: error 5
[  487.871667] XFS (md0p1): log mount failed
[  488.899680] XFS (md0p1): Mounting Filesystem
[  488.917484] XFS (md0p1): Ending clean mount
[  489.943430] XFS (md0p1): last sector read failed
[  490.957577] XFS (md0p1): Mounting Filesystem
[ 490.975022] XFS (md0p1): metadata I/O error: block 0x60 ("xfs_trans_read_buf") error 5 buf count 4096
[  491.037209] XFS (md0p1): Ending clean mount
[  494.478260] Buffer I/O error on device md0p1, logical block 23
[  494.479204] Buffer I/O error on device md0p1, logical block 63
[  494.479540] Buffer I/O error on device md0p1, logical block 15
[  498.577283] XFS (md0p1): Mounting Filesystem
[ 498.604225] XFS (md0p1): metadata I/O error: block 0x60 ("xfs_trans_read_buf") error 5 buf count 4096
[  498.666825] XFS (md0p1): Ending clean mount
[  499.693310] XFS (md0p1): Mounting Filesystem
[ 499.719749] XFS (md0p1): metadata I/O error: block 0x40 ("xfs_trans_read_buf") error 5 buf count 8192 [ 499.782405] XFS (md0p1): xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
[  499.782409] XFS (md0p1): failed to read root inode
[ 499.782482] XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 272
[  499.846615] ------------[ cut here ]------------
[  499.878410] kernel BUG at fs/xfs/xfs_message.c:101!
[  499.878412] invalid opcode: 0000 [#1] SMP
[  499.878415] CPU 0
[ 499.878416] Modules linked in: iscsi_scst(O) scst_vdisk(O) scst(O) libcrc32c ext2 drbd(O) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bonding sg e1000e(O) usbserial uhci_hcd ohci_hcd ehci_hcd aufs [last unloaded: ohci_hcd]
[  499.878426]
[ 499.878428] Pid: 17632, comm: mount Tainted: G O 3.4.39-oe64-00000-g8b0d7e5 #9 Supermicro C2SBC-Q/C2SBC-Q [ 499.878431] RIP: 0010:[<ffffffff812fe68d>] [<ffffffff812fe68d>] assfail+0x1d/0x30
[  499.878437] RSP: 0000:ffff880135381c78  EFLAGS: 00010296
[ 499.878439] RAX: 0000000000000071 RBX: ffff88013a5c0540 RCX: ffff88013b006c00 [ 499.878440] RDX: 00000000000000eb RSI: 0000000000000046 RDI: 0000000000000000 [ 499.878442] RBP: ffff880135381c78 R08: ffff88013b006c00 R09: ffff88013fc12440 [ 499.878443] R10: ffff88013fd124e8 R11: 0000000000000000 R12: ffff880139736000 [ 499.878445] R13: 0000000000000000 R14: ffff8801397361e8 R15: ffff8801397361f8 [ 499.878446] FS: 0000000000000000(0000) GS:ffff88013fc00000(0063) knlGS:00000000f75146c0
[  499.878448] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[ 499.878449] CR2: 00000000f7751181 CR3: 00000001359b4000 CR4: 00000000000407f0 [ 499.878451] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 499.878452] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 499.878454] Process mount (pid: 17632, threadinfo ffff880135380000, task ffff880135005120)
[  499.878455] Stack:
[ 499.878456] ffff880135381cb8 ffffffff8134e68b 0000000000000005 ffff880139736000 [ 499.878458] ffff880139736018 0000000000000005 ffff880139736000 ffff8801352a9c00 [ 499.878461] ffff880135381d08 ffffffff8135008d 0000000000000000 0000000000000000
[  499.878463] Call Trace:
[  499.878466]  [<ffffffff8134e68b>] xfs_free_perag+0x8b/0xc0
[  499.878469]  [<ffffffff8135008d>] xfs_mountfs+0x31d/0x700
[  499.878471]  [<ffffffff81300dd0>] xfs_fs_fill_super+0x1e0/0x280
[  499.878474]  [<ffffffff8114dcab>] mount_bdev+0x19b/0x1d0
[  499.878476]  [<ffffffff81300bf0>] ? xfs_fs_evict_inode+0x130/0x130
[  499.878479]  [<ffffffff81388f46>] ? selinux_sb_copy_data+0x156/0x1d0
[  499.878481]  [<ffffffff812ff050>] xfs_fs_mount+0x10/0x20
[  499.878483]  [<ffffffff8114d341>] mount_fs+0x41/0x180
[  499.878486]  [<ffffffff81166f99>] vfs_kern_mount+0x69/0xf0
[  499.878488]  [<ffffffff811670ae>] do_kern_mount+0x4e/0x100
[  499.878490]  [<ffffffff8116805a>] do_mount+0x51a/0x760
[  499.878492]  [<ffffffff8110bee9>] ? __get_free_pages+0x9/0x40
[  499.878496]  [<ffffffff8118ee32>] compat_sys_mount+0xa2/0x220
[  499.878498]  [<ffffffff8108694f>] ? sys_rt_sigprocmask+0xbf/0xd0
[  499.878501]  [<ffffffff81794169>] ia32_do_call+0x13/0x13
[ 499.878502] Code: 66 66 90 66 66 66 90 66 66 66 90 66 66 90 55 41 89 d0 48 89 f1 48 89 fa 48 c7 c6 d0 cb 9b 81 31 ff 48 89 e5 31 c0 e8 53 ff ff ff <0f> 0b eb fe 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 31
[  499.878518] RIP  [<ffffffff812fe68d>] assfail+0x1d/0x30
[  499.878520]  RSP <ffff880135381c78>
[  499.878522] ---[ end trace 37e88031b68311f3 ]---

--

Best regards
Arkadiusz Bubała


_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs





[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux