I've seen similar behavior with the cephfs kernel client around that kernel version; try 4.14+.
On 15 May 2018 1:57 p.m., "Josef Zelenka" <josef.zelenka@xxxxxxxxxxxxxxxx> wrote:
The client's kernel is 4.4.0. Regarding the hung OSD request, I'll have to
check - the issue is gone now, so I'm not sure I'll find what you are
suggesting. It's rather odd, because Ceph's failover has worked for us every
time, so I'm trying to figure out whether this is a Ceph or an application issue.
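For reference, the in-flight OSD requests Yan asks about can be inspected through the kernel client's debugfs files. A minimal sketch (assuming debugfs is mounted at /sys/kernel/debug, root access, and that the per-mount directory name, which is cluster-specific, appears under /sys/kernel/debug/ceph/):

```shell
#!/bin/sh
# Dump outstanding OSD requests for each cephfs kernel-client mount.
# A non-empty osdc file while the cluster is otherwise healthy suggests
# a request stuck waiting on an OSD.
for d in /sys/kernel/debug/ceph/*/; do
    [ -e "$d/osdc" ] || continue   # skip if no kernel client is mounted
    echo "== $d =="
    cat "$d/osdc"                  # one line per in-flight OSD request
done
echo "scan complete"
```

An empty osdc file means no requests are currently pending, which would point away from a stuck OSD and toward the application side.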
On 15/05/18 02:57, Yan, Zheng wrote:
> On Mon, May 14, 2018 at 5:37 PM, Josef Zelenka
> <josef.zelenka@xxxxxxxxxxxxxxxx> wrote:
>> Hi everyone, we've encountered an unusual issue in our setup (4 nodes, 48
>> OSDs, 3 monitors - ceph Jewel, Ubuntu 16.04 with kernel 4.4.0). Yesterday,
>> we were doing a HW upgrade of the nodes, so they went down one by one - the
>> cluster was in good shape during the upgrade, as we've done this numerous
>> times, and we're quite sure the redundancy wasn't compromised while doing
>> this. However, during this upgrade one of the clients that writes backups to
>> cephfs (mounted via the kernel driver) failed to write the backup file
>> correctly to the cluster, producing the following trace after we turned off
>> one of the nodes:
>>
>> [2585732.529412] ffff8800baa279a8 ffffffff813fb2df ffff880236230e00
>> ffff8802339c0000
>> [2585732.529414] ffff8800baa28000 ffff88023fc96e00 7fffffffffffffff
>> ffff8800baa27b20
>> [2585732.529415] ffffffff81840ed0 ffff8800baa279c0 ffffffff818406d5
>> 0000000000000000
>> [2585732.529417] Call Trace:
>> [2585732.529505] [<ffffffff813fb2df>] ? cpumask_next_and+0x2f/0x40
>> [2585732.529558] [<ffffffff81840ed0>] ? bit_wait+0x60/0x60
>> [2585732.529560] [<ffffffff818406d5>] schedule+0x35/0x80
>> [2585732.529562] [<ffffffff81843825>] schedule_timeout+0x1b5/0x270
>> [2585732.529607] [<ffffffff810642be>] ? kvm_clock_get_cycles+0x1e/0x20
>> [2585732.529609] [<ffffffff81840ed0>] ? bit_wait+0x60/0x60
>> [2585732.529611] [<ffffffff8183fc04>] io_schedule_timeout+0xa4/0x110
>> [2585732.529613] [<ffffffff81840eeb>] bit_wait_io+0x1b/0x70
>> [2585732.529614] [<ffffffff81840c6e>] __wait_on_bit_lock+0x4e/0xb0
>> [2585732.529652] [<ffffffff8118f3cb>] __lock_page+0xbb/0xe0
>> [2585732.529674] [<ffffffff810c4460>] ? autoremove_wake_function+0x40/0x40
>> [2585732.529676] [<ffffffff8119078d>] pagecache_get_page+0x17d/0x1c0
>> [2585732.529730] [<ffffffffc056b3a8>] ? ceph_pool_perm_check+0x48/0x700
>> [ceph]
>> [2585732.529732] [<ffffffff811907f6>] grab_cache_page_write_begin+0x26/0x40
>> [2585732.529738] [<ffffffffc056a6a8>] ceph_write_begin+0x48/0xe0 [ceph]
>> [2585732.529739] [<ffffffff8118fd6e>] generic_perform_write+0xce/0x1c0
>> [2585732.529763] [<ffffffff8122bdb9>] ? file_update_time+0xc9/0x110
>> [2585732.529769] [<ffffffffc05651c9>] ceph_write_iter+0xf89/0x1040 [ceph]
>> [2585732.529792] [<ffffffff81199c19>] ? __alloc_pages_nodemask+0x159/0x2a0
>> [2585732.529808] [<ffffffff8120fedb>] new_sync_write+0x9b/0xe0
>> [2585732.529811] [<ffffffff8120ff46>] __vfs_write+0x26/0x40
>> [2585732.529812] [<ffffffff812108c9>] vfs_write+0xa9/0x1a0
>> [2585732.529814] [<ffffffff81211585>] SyS_write+0x55/0xc0
>> [2585732.529817] [<ffffffff818447f2>] entry_SYSCALL_64_fastpath+0x16/0x71
>>
>>
> Is there any hung OSD request in /sys/kernel/debug/ceph/xxxx/osdc?
>
>> I have encountered this behavior on Luminous, but not on Jewel. Does anyone
>> have a clue why the write fails? As far as I'm concerned, it should always
>> work as long as all the PGs are available. Thanks
>> Josef
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com