GPF kernel panics

Hi,

I've had a fun time with Ceph this week.
We have a cluster with 4 OSD servers (20 OSDs each), 3 mons, and a server
that maps ~200 RBDs and presents them as CIFS shares.

We're using cephx and the export node has its own cephx auth key.

I made a change to the key last week, adding rwx access to another pool.

Since then, the export node has had sporadic kernel panics.

It got to the point where it would panic when it had barely finished booting.

Once I removed the extra pool I had added to the auth key, it hasn't
crashed again.
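
For anyone trying to reproduce this, here is a minimal sketch of the kind of
caps change involved. The entity name (client.export), the pool names (rbd,
newpool) and the mon cap (allow r) are placeholders, not the real values from
this cluster. Note that `ceph auth caps` replaces the whole capability set, so
the existing pools have to be restated along with the new one. The script only
prints the command; it does not run it.

```python
import shlex


def osd_caps(pools, perms="rwx"):
    """Build an OSD capability string granting `perms` on each listed pool."""
    return ", ".join(f"allow {perms} pool={pool}" for pool in pools)


def auth_caps_argv(entity, pools, mon_caps="allow r"):
    """Return the argv for `ceph auth caps`, restating every pool.

    `entity`, `pools` and `mon_caps` are hypothetical values for illustration.
    """
    return ["ceph", "auth", "caps", entity,
            "mon", mon_caps,
            "osd", osd_caps(pools)]


if __name__ == "__main__":
    # Key with the extra pool added (the state in which the panics occurred).
    added = auth_caps_argv("client.export", ["rbd", "newpool"])
    # Key with the extra pool removed again (the rollback).
    rolled_back = auth_caps_argv("client.export", ["rbd"])
    for argv in (added, rolled_back):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Running `ceph auth get` on the entity afterwards shows the caps actually
stored, which is worth comparing before and after the change.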

I'm a bit concerned that a change to an auth key can cause this type of
crash.
There were no log entries on the mon, OSD, or export nodes regarding the key
at all, so it was only by searching my memory for what had changed that I was
able to resolve the problem.

From what I could tell from the key, the format was correct and the pool
that I added did exist, so I am confused as to how this would have caused
kernel panics.

Below is an example of one of the crash stacktraces.

[   32.713504] general protection fault: 0000 [#1] SMP
[   32.724718] Modules linked in: ipt_REJECT xt_tcpudp iptable_filter
ip_tables x_tables rbd libceph libcrc32c gpio_ich dcdbas intel_rapl
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm
crct10dif_pclmul joydev crc32_pclmul ghash_clmulni_intel aesni_intel
aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac edac_core
shpchp lpc_ich mei_me mei wmi ipmi_si mac_hid acpi_power_meter 8021q garp
stp mrp llc bonding lp parport nfsd auth_rpcgss nfs_acl nfs lockd sunrpc
fscache hid_generic igb ixgbe i2c_algo_bit usbhid dca hid ptp ahci libahci
pps_core megaraid_sas mdio
[   32.843936] CPU: 18 PID: 5030 Comm: tr Not tainted 3.13.0-30-generic
#54-Ubuntu
[   32.860163] Hardware name: Dell Inc. PowerEdge R620/0PXXHP, BIOS 1.6.0
03/07/2013
[   32.876774] task: ffff880417b15fc0 ti: ffff8804273f4000 task.ti:
ffff8804273f4000
[   32.893384] RIP: 0010:[<ffffffff811a19c5>]  [<ffffffff811a19c5>]
kmem_cache_alloc+0x75/0x1e0
[   32.912198] RSP: 0018:ffff8804273f5d40  EFLAGS: 00010286
[   32.924015] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
00000000000011ed
[   32.939856] RDX: 00000000000011ec RSI: 00000000000080d0 RDI:
ffff88042f803700
[   32.955696] RBP: ffff8804273f5d70 R08: 0000000000017260 R09:
ffffffff811be63c
[   32.971559] R10: 8080808080808080 R11: 0000000000000000 R12:
7d10f8ec0c3cb928
[   32.987421] R13: 00000000000080d0 R14: ffff88042f803700 R15:
ffff88042f803700
[   33.003284] FS:  0000000000000000(0000) GS:ffff88042fd20000(0000)
knlGS:0000000000000000
[   33.021281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   33.034068] CR2: 00007f01a8fced40 CR3: 000000040e52f000 CR4:
00000000000407e0
[   33.049929] Stack:
[   33.054456]  ffffffff811be63c 0000000000000000 ffff88041be52780
ffff880428052000
[   33.071259]  ffff8804273f5f2c 00000000ffffff9c ffff8804273f5d98
ffffffff811be63c
[   33.088084]  0000000000000080 ffff8804273f5f2c ffff8804273f5e40
ffff8804273f5e30
[   33.104908] Call Trace:
[   33.110399]  [<ffffffff811be63c>] ? get_empty_filp+0x5c/0x180
[   33.123188]  [<ffffffff811be63c>] get_empty_filp+0x5c/0x180
[   33.135593]  [<ffffffff811cc03d>] path_openat+0x3d/0x620
[   33.147422]  [<ffffffff811cd47a>] do_filp_open+0x3a/0x90
[   33.159250]  [<ffffffff811a1985>] ? kmem_cache_alloc+0x35/0x1e0
[   33.172405]  [<ffffffff811cc6bf>] ? getname_flags+0x4f/0x190
[   33.185004]  [<ffffffff811da237>] ? __alloc_fd+0xa7/0x130
[   33.197025]  [<ffffffff811bbb99>] do_sys_open+0x129/0x280
[   33.209049]  [<ffffffff81020d25>] ? syscall_trace_enter+0x145/0x250
[   33.222992]  [<ffffffff811bbd0e>] SyS_open+0x1e/0x20
[   33.234053]  [<ffffffff8172aeff>] tracesys+0xe1/0xe6
[   33.245112] Code: dc 00 00 49 8b 50 08 4d 8b 20 49 8b 40 10 4d 85 e4 0f
84 17 01 00 00 48 85 c0 0f 84 0e 01 00 00 49 63 46 20 48 8d 4a 01 4d 8b 06
<49> 8b 1c 04 4c 89 e0 65 49 0f c7 08 0f 94 c0 84 c0 74 b9 49 63
[   33.292549] RIP  [<ffffffff811a19c5>] kmem_cache_alloc+0x75/0x1e0
[   33.306192]  RSP <ffff8804273f5d40>
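
Since the panics were sporadic, it can help to compare several traces side by
side. Below is a rough sketch that pulls the faulting function and the
reliable call chain out of an oops like the one above. The function name
summarize_oops and the regular expressions are my own, written against the
3.13-era trace format shown here; other kernel versions print frames
differently. Frames marked with "?" are skipped because the kernel flags them
as unreliable.

```python
import re

# Matches "symbol+0xOFF/0xSIZE" and captures the symbol name.
SYMBOL = r"([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
# A call-trace frame: "[<address>] [? ]symbol+0x.../0x..."
FRAME_RE = re.compile(r"\[<[0-9a-f]{16}>\]\s+(\? )?" + SYMBOL)
# The summary RIP line printed at the end of the oops.
RIP_RE = re.compile(r"RIP\s+\[<[0-9a-f]{16}>\]\s+" + SYMBOL)


def summarize_oops(text):
    """Return the faulting symbol and the reliable frames of one oops."""
    rip = RIP_RE.search(text)
    start = text.find("Call Trace:")
    trace = text[start:] if start != -1 else ""
    # The call trace ends where the "Code:" dump begins.
    trace = trace.split("Code:")[0]
    frames = [m.group(2) for m in FRAME_RE.finditer(trace)
              if m.group(1) is None]
    return {"rip": rip.group(1) if rip else None, "frames": frames}


if __name__ == "__main__":
    import sys
    # Usage: dmesg | python3 summarize_oops.py
    print(summarize_oops(sys.stdin.read()))
```

If the faulting function and call chain differ from crash to crash, that
points toward general memory corruption rather than a bug in one code path.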