Re: cephfs Kernel panic

Thank you! That's it. I have installed the kernel from jessie-backports,
and the crashes are gone.
How often do these things happen? It would be a worst-case scenario if
a system update broke a production system.
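
For anyone who lands here with the same trace, here is a minimal sketch of
a pre-mount check. It only knows about 3.16.7-ckt25-2, the one build
confirmed as affected in this thread; the function name is made up for
illustration, and the apt-get hint assumes jessie-backports is already in
your sources.list.

```shell
#!/bin/sh
# Sketch: warn before mounting CephFS if the running kernel is the build
# reported as affected in this thread. Other affected builds are not
# listed here, so extend the pattern yourself (assumption).

# is_known_affected STRING: succeeds if STRING names the affected build.
is_known_affected() {
    case "$1" in
        *3.16.7-ckt25-2*) return 0 ;;
        *) return 1 ;;
    esac
}

# On Debian, `uname -v` includes the kernel package version string.
if is_known_affected "$(uname -v)"; then
    echo "Running kernel matches 3.16.7-ckt25-2: avoid the CephFS kernel client for now."
    echo "One option on jessie: apt-get install -t jessie-backports linux-image-amd64, then reboot."
else
    echo "Running kernel is not the build named in this thread."
fi
```

A miss only means the kernel is not this particular build; it does not
prove the kernel is free of the bug.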

Best
Simon

On 11.04.2016 at 16:58, Ilya Dryomov wrote:
> On Mon, Apr 11, 2016 at 4:37 PM, Simon Ferber
> <ferber@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> I am trying to set up a Ceph cluster on Debian 8.4. I mainly followed
>> the tutorial at
>> http://adminforge.de/raid/ceph/ceph-cluster-unter-debian-wheezy-installieren/
>>
>> As far as I can see, the first steps work fine. I have two nodes with
>> four OSDs each.
>> This is the output of ceph -s:
>>
>>     cluster 2a028d5e-5708-4fc4-9c0d-3495c1a3ef3d
>>      health HEALTH_OK
>>      monmap e2: 2 mons at
>> {ollie2=129.217.207.207:6789/0,stan2=129.217.207.206:6789/0}
>>             election epoch 12, quorum 0,1 stan2,ollie2
>>      mdsmap e10: 1/1/1 up {0=ollie2=up:active}, 1 up:standby
>>      osdmap e72: 8 osds: 8 up, 8 in
>>             flags sortbitwise
>>       pgmap v137: 428 pgs, 4 pools, 2396 bytes data, 20 objects
>>             281 MB used, 14856 GB / 14856 GB avail
>>                  428 active+clean
>>
>> Then I tried to add CephFS following the manual at
>> http://docs.ceph.com/docs/hammer/cephfs/createfs/ which seems to do its
>> magic:
>> root@stan2:~# ceph fs ls
>> name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
>>
>> However, as soon as I try to mount the CephFS with
>>
>>     mount.ceph 129.217.207.206:6789:/ /mnt/ -v -o name=cephfs,secretfile=/etc/ceph/client.cephfs
>>
>> the server that attempts the mount crashes and has to be cold-started.
>> To be able to use mount.ceph I had to install ceph-fs-common, in case
>> that matters.
>>
>> Here is the kernel.log. Can you give me any hints? I have been stuck on
>> this for the last few days.
>>
>> Apr 11 16:25:02 stan2 kernel: [  171.086381] Key type ceph registered
>> Apr 11 16:25:02 stan2 kernel: [  171.086649] libceph: loaded (mon/osd
>> proto 15/24)
>> Apr 11 16:25:02 stan2 kernel: [  171.090582] FS-Cache: Netfs 'ceph'
>> registered for caching
>> Apr 11 16:25:02 stan2 kernel: [  171.090596] ceph: loaded (mds proto 32)
>> Apr 11 16:25:02 stan2 kernel: [  171.096727] libceph: client34164 fsid
>> 2a028d5e-5708-4fc4-9c0d-3495c1a3ef3d
>> Apr 11 16:25:02 stan2 kernel: [  171.133832] libceph: mon0
>> 129.217.207.206:6789 session established
>> Apr 11 16:25:02 stan2 kernel: [  171.161199] ------------[ cut here
>> ]------------
>> Apr 11 16:25:02 stan2 kernel: [  171.161239] kernel BUG at
>> /build/linux-lqALYs/linux-3.16.7-ckt25/fs/ceph/mds_client.c:1846!
>> Apr 11 16:25:02 stan2 kernel: [  171.161294] invalid opcode: 0000 [#1] SMP
>> Apr 11 16:25:02 stan2 kernel: [  171.161328] Modules linked in: cbc ceph
>> libceph xfs libcrc32c crc32c_generic binfmt_misc mptctl mptbase nfsd
>> auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc nls_utf8
>> nls_cp437 vfat fat x86_pkg_temp_thermal intel_powerclamp intel_rapl
>> coretemp kvm_intel kvm crc32_pclmul cryptd iTCO_wdt iTCO_vendor_support
>> efi_pstore efivars pcspkr joydev evdev ast i2c_i801 ttm drm_kms_helper
>> drm lpc_ich mfd_core mei_me mei shpchp ioatdma tpm_tis wmi tpm ipmi_si
>> ipmi_msghandler processor thermal_sys acpi_power_meter button acpi_pad
>> fuse autofs4 ext4 crc16 mbcache jbd2 dm_mod raid1 md_mod hid_generic sg
>> usbhid hid sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul
>> crct10dif_common crc32c_intel ahci libahci ehci_pci mpt3sas igb
>> raid_class i2c_algo_bit xhci_hcd libata ehci_hcd scsi_transport_sas
>> i2c_core dca usbcore ptp usb_common scsi_mod pps_core
>> Apr 11 16:25:02 stan2 kernel: [  171.162046] CPU: 0 PID: 3513 Comm:
>> kworker/0:9 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2
>> Apr 11 16:25:02 stan2 kernel: [  171.162104] Hardware name: Supermicro
>> SYS-6028R-WTR/X10DRW-i, BIOS 1.0c 01/07/2015
>> Apr 11 16:25:02 stan2 kernel: [  171.162158] Workqueue: ceph-msgr
>> con_work [libceph]
>> Apr 11 16:25:02 stan2 kernel: [  171.162194] task: ffff88103f2e8ae0 ti:
>> ffff88103bfbc000 task.ti: ffff88103bfbc000
>> Apr 11 16:25:02 stan2 kernel: [  171.162243] RIP:
>> 0010:[<ffffffffa0733071>]  [<ffffffffa0733071>]
>> __prepare_send_request+0x801/0x810 [ceph]
>> Apr 11 16:25:02 stan2 kernel: [  171.162312] RSP: 0018:ffff88103bfbfba8
>> EFLAGS: 00010283
>> Apr 11 16:25:02 stan2 kernel: [  171.162347] RAX: ffff88103f88ad42 RBX:
>> ffff88103f7f7400 RCX: 0000000000000000
>> Apr 11 16:25:02 stan2 kernel: [  171.162394] RDX: 00000000164c5ec6 RSI:
>> 0000000000000000 RDI: ffff88103f88ad32
>> Apr 11 16:25:02 stan2 kernel: [  171.162440] RBP: ffff88103f7f95e0 R08:
>> 0000000000000000 R09: 0000000000000000
>> Apr 11 16:25:02 stan2 kernel: [  171.162485] R10: 0000000000000000 R11:
>> 000000000000002c R12: ffff88103f7f7c00
>> Apr 11 16:25:02 stan2 kernel: [  171.162531] R13: ffff88103f88acc0 R14:
>> 0000000000000000 R15: ffff88103f88ad3a
>> Apr 11 16:25:02 stan2 kernel: [  171.162578] FS:  0000000000000000(0000)
>> GS:ffff88107fc00000(0000) knlGS:0000000000000000
>> Apr 11 16:25:02 stan2 kernel: [  171.162629] CS:  0010 DS: 0000 ES: 0000
>> CR0: 0000000080050033
>> Apr 11 16:25:02 stan2 kernel: [  171.162668] CR2: 00007fa73ca0a000 CR3:
>> 0000000001a13000 CR4: 00000000001407f0
>> Apr 11 16:25:02 stan2 kernel: [  171.162713] Stack:
>> Apr 11 16:25:02 stan2 kernel: [  171.162730]  ffff88103bfbfbd4
>> ffff88103ef39540 0000000000000001 0000000000000000
>> Apr 11 16:25:02 stan2 kernel: [  171.162787]  0000000000000000
>> 0000000000000000 ffff88103ef39540 0000000000000000
>> Apr 11 16:25:02 stan2 kernel: [  171.162845]  0000000000000001
>> 0000000000000000 ffff88103f88ad42 ffff88103f7f7400
>> Apr 11 16:25:02 stan2 kernel: [  171.162903] Call Trace:
>> Apr 11 16:25:02 stan2 kernel: [  171.162928]  [<ffffffffa073328f>] ?
>> __do_request+0x20f/0x310 [ceph]
>> Apr 11 16:25:02 stan2 kernel: [  171.162974]  [<ffffffffa0733400>] ?
>> __wake_requests+0x70/0xb0 [ceph]
>> Apr 11 16:25:02 stan2 kernel: [  171.163021]  [<ffffffffa0735750>] ?
>> dispatch+0x440/0x1780 [ceph]
>> Apr 11 16:25:02 stan2 kernel: [  171.163065]  [<ffffffffa06e57f7>] ?
>> ceph_tcp_recvmsg+0x47/0x60 [libceph]
>> Apr 11 16:25:02 stan2 kernel: [  171.163112]  [<ffffffffa06e97ce>] ?
>> con_work+0x18fe/0x2c40 [libceph]
>> Apr 11 16:25:02 stan2 kernel: [  171.163159]  [<ffffffff8101b975>] ?
>> sched_clock+0x5/0x10
>> Apr 11 16:25:02 stan2 kernel: [  171.163197]  [<ffffffff810817c2>] ?
>> process_one_work+0x172/0x420
>> Apr 11 16:25:02 stan2 kernel: [  171.163238]  [<ffffffff81081e53>] ?
>> worker_thread+0x113/0x4f0
>> Apr 11 16:25:02 stan2 kernel: [  171.163277]  [<ffffffff81081d40>] ?
>> rescuer_thread+0x2d0/0x2d0
>> Apr 11 16:25:02 stan2 kernel: [  171.163318]  [<ffffffff8108809d>] ?
>> kthread+0xbd/0xe0
>> Apr 11 16:25:02 stan2 kernel: [  171.163355]  [<ffffffff81087fe0>] ?
>> kthread_create_on_node+0x180/0x180
>> Apr 11 16:25:02 stan2 kernel: [  171.163402]  [<ffffffff81514958>] ?
>> ret_from_fork+0x58/0x90
>> Apr 11 16:25:02 stan2 kernel: [  171.163441]  [<ffffffff81087fe0>] ?
>> kthread_create_on_node+0x180/0x180
>> Apr 11 16:25:02 stan2 kernel: [  171.163485] Code: 49 8b 8d a0 fc ff ff
>> 4d 8b 85 a8 fc ff ff 4c 89 ea 48 c7 c6 64 89 74 a0 48 c7 c7 38 0e 75 a0
>> 31 c0 e8 c4 d6 b9 e0 e9 a8 f9 ff ff <0f> 0b 0f 0b 66 66 2e 0f 1f 84 00
>> 00 00 00 00 0f 1f 44 00 00 41
>> Apr 11 16:25:02 stan2 kernel: [  171.163758] RIP  [<ffffffffa0733071>]
>> __prepare_send_request+0x801/0x810 [ceph]
>> Apr 11 16:25:02 stan2 kernel: [  171.163812]  RSP <ffff88103bfbfba8>
>> Apr 11 16:25:02 stan2 kernel: [  171.163839] ---[ end trace
>> 797bb0683850e8df ]---
>> Apr 11 16:25:02 stan2 kernel: [  171.167929] BUG: unable to handle
>> kernel paging request at ffffffffffffffd8
>> Apr 11 16:25:02 stan2 kernel: [  171.167985] IP: [<ffffffff8108862c>]
>> kthread_data+0xc/0x20
> 
> Hi,
> 
> This is an accidental breakage caused by a backport of a patch that
> wasn't explicitly marked for backporting.  Your 3.16.7-ckt25-2 is one
> of the affected kernels.  The Ubuntu kernel team has the fix queued -
> it's about to be released.  You can also grab an unreleased kernel with
> the fix applied from one of the links in the ceph-devel thread [1].  See
> also [2].
> 
> [1] http://www.spinics.net/lists/ceph-devel/msg29504.html
> [2] http://tracker.ceph.com/issues/15302
> 
> Thanks,
> 
>                 Ilya
> 


-- 
Simon Ferber
Technician

Technische Universität Dortmund
Fakultät Statistik
Vogelpothsweg 87
44227 Dortmund

Tel.: +49 231-755 3188
Fax: +49 231-755 5305
simon.ferber@xxxxxxxxxxxxxx
www.tu-dortmund.de


Important note: The information included in this e-mail is confidential.
It is solely intended for the recipient. If you are not the intended
recipient of this e-mail please contact the sender and delete this
message. Thank you.
Without prejudice of e-mail correspondence, our statements are only
legally binding when they are made in the conventional written form
(with personal signature) or when such documents are sent by fax.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com