Frequent Crashes on rbd to nfs gateway Server

On Fri, Sep 19, 2014 at 11:22 AM, Micha Krause <micha at krausam.de> wrote:
> Hi,
>
>> I have built an NFS server based on Sebastien's blog post here:
>>
>> http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/
>>
>> I'm using kernel 3.14-0.bpo.1-amd64 on Debian wheezy; the host is a VM on
>> VMware.
>>
>> Using rsync, I'm writing data via NFS from one client to this server.
>>
>> The NFS server crashes multiple times per day, and I can't even log in to
>> the server then.
>> After a reset, there is no kernel log about the crash, so I guess
>> something is blocking all I/O.
>
>
> Ok, it seems that I just can't get a shell, but I can run commands via ssh
> directly.

So does it actually crash, or is it just the blocked I/Os?  If it doesn't
crash, you should be able to get everything off dmesg.

>
> I was able to get the following information:
>
> dmesg:
>
> [18102.981064] INFO: task nfsd:2769 blocked for more than 120 seconds.
> [18102.981112]       Not tainted 3.14-0.bpo.1-amd64 #1
> [18102.981150] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [18102.981216] nfsd            D ffff88003fc14340     0  2769      2 0x00000000
> [18102.981218]  ffff88003bac6e20 0000000000000046 0000000000000000 ffff88003d47ada0
> [18102.981219]  0000000000014340 ffff88003ce31fd8 0000000000014340 ffff88003bac6e20
> [18102.981221]  ffff88003ce31728 ffff8800029539f0 7fffffffffffffff 7fffffffffffffff
> [18102.981223] Call Trace:
> [18102.981225]  [<ffffffff814eedbd>] ? schedule_timeout+0x1ed/0x250
> [18102.981231]  [<ffffffffa04b0f92>] ? _xfs_buf_find+0xd2/0x280 [xfs]
> [18102.981234]  [<ffffffff8117fc2c>] ? kmem_cache_alloc+0x1bc/0x1f0
> [18102.981236]  [<ffffffff814f193c>] ? __down_common+0x97/0xea
> [18102.981241]  [<ffffffffa04b0faa>] ? _xfs_buf_find+0xea/0x280 [xfs]
> [18102.981243]  [<ffffffff810aa697>] ? down+0x37/0x40
> [18102.981247]  [<ffffffffa04b0e02>] ? xfs_buf_lock+0x32/0xf0 [xfs]
> [18102.981252]  [<ffffffffa04b0faa>] ? _xfs_buf_find+0xea/0x280 [xfs]
> [18102.981257]  [<ffffffffa04b1215>] ? xfs_buf_get_map+0x35/0x1a0 [xfs]
> [18102.981263]  [<ffffffffa04b2153>] ? xfs_buf_read_map+0x33/0x130 [xfs]
> [18102.981269]  [<ffffffffa05161da>] ? xfs_trans_read_buf_map+0x34a/0x4f0 [xfs]
> [18102.981275]  [<ffffffffa05036f9>] ? xfs_imap_to_bp+0x69/0xf0 [xfs]
> [18102.981281]  [<ffffffffa0503bcd>] ? xfs_iread+0x7d/0x3f0 [xfs]
> [18102.981284]  [<ffffffff810e8939>] ? make_kgid+0x9/0x10
> [18102.981286]  [<ffffffff811b148e>] ? inode_init_always+0x10e/0x1d0
> [18102.981292]  [<ffffffffa04ba11a>] ? xfs_iget+0x2ba/0x810 [xfs]
> [18102.981298]  [<ffffffffa04fd9a6>] ? xfs_ialloc+0xe6/0x740 [xfs]
> [18102.981305]  [<ffffffffa04ca1ee>] ? kmem_zone_alloc+0x6e/0xf0 [xfs]
> [18102.981311]  [<ffffffffa04fe083>] ? xfs_dir_ialloc+0x83/0x300 [xfs]
> [18102.981317]  [<ffffffffa04c8e43>] ? xfs_trans_reserve+0x213/0x220 [xfs]
> [18102.981323]  [<ffffffffa04fe87e>] ? xfs_create+0x4fe/0x720 [xfs]
> [18102.981329]  [<ffffffffa04bfd02>] ? xfs_vn_mknod+0xd2/0x200 [xfs]
> [18102.981331]  [<ffffffff811a6b54>] ? vfs_create+0xe4/0x160
> [18102.981335]  [<ffffffffa0400d9e>] ? do_nfsd_create+0x53e/0x610 [nfsd]
> [18102.981339]  [<ffffffffa0407f4d>] ? nfsd3_proc_create+0x16d/0x250 [nfsd]
> [18102.981342]  [<ffffffffa03f9d74>] ? nfsd_dispatch+0xe4/0x230 [nfsd]
> [18102.981347]  [<ffffffffa035dd64>] ? svc_process_common+0x354/0x690 [sunrpc]
> [18102.981349]  [<ffffffff81096ab0>] ? try_to_wake_up+0x280/0x280
> [18102.981353]  [<ffffffffa035e3fb>] ? svc_process+0x10b/0x160 [sunrpc]
> [18102.981359]  [<ffffffffa03f96d7>] ? nfsd+0xb7/0x130 [nfsd]
> [18102.981363]  [<ffffffffa03f9620>] ? nfsd_destroy+0x70/0x70 [nfsd]
> [18102.981365]  [<ffffffff81086d6c>] ? kthread+0xbc/0xe0
> [18102.981367]  [<ffffffff81086cb0>] ? flush_kthread_worker+0xa0/0xa0
> [18102.981369]  [<ffffffff814faecc>] ? ret_from_fork+0x7c/0xb0
> [18102.981371]  [<ffffffff81086cb0>] ? flush_kthread_worker+0xa0/0xa0

Is that the only hung task in dmesg?
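
One way to answer that quickly is to pull every hung-task report out of a
saved dmesg capture and count them per task. The following is only a
minimal sketch: it assumes the message wording shown in the trace above
("INFO: task <name>:<pid> blocked for more than <N> seconds"), and the
function name find_hung_tasks is made up for illustration.

```python
import re

# Matches the hung-task header line as it appears in the quoted dmesg output.
# Assumption: the kernel uses this exact wording on the machine in question.
HUNG_RE = re.compile(
    r"INFO: task (.+):(\d+) blocked for more than (\d+) seconds"
)


def find_hung_tasks(dmesg_text):
    """Return {(task_name, pid): number_of_reports} for a dmesg capture."""
    seen = {}
    for m in HUNG_RE.finditer(dmesg_text):
        key = (m.group(1), int(m.group(2)))
        seen[key] = seen.get(key, 0) + 1
    return seen
```

Feeding it the text of a dmesg capture gives one entry per blocked task, so
it is easy to see whether nfsd is the only one stuck or whether other
threads are waiting as well.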

>
> iostat:
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.00    0.00    1.00   99.00    0.00    0.00
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> rbd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00    46.00    0.00    0.00    0.00   0.00 100.00
> rbd1              0.00     0.00    0.00    0.00     0.00     0.00     0.00    12.00    0.00    0.00    0.00   0.00 100.00
> rbd2              0.00     0.00    0.00    0.00     0.00     0.00     0.00   136.00    0.00    0.00    0.00   0.00 100.00
> rbd3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> rbd4              0.00     0.00    0.00    0.00     0.00     0.00     0.00    11.00    0.00    0.00    0.00   0.00 100.00
> rbd5              0.00     0.00    0.00    0.00     0.00     0.00     0.00    57.00    0.00    0.00    0.00   0.00 100.00
> emcpowerig        0.00     0.00    0.00    0.00     0.00     0.00     0.00    32.00    0.00    0.00    0.00   0.00 100.00
> emcpowerhq        0.00     0.00    0.00    0.00     0.00     0.00     0.00    38.00    0.00    0.00    0.00   0.00 100.00
>
> (for some reason rbd6 and rbd7 are shown as emcpower in iostat, no idea why)
>
> cat /sys/kernel/debug/ceph/*/*
>
> have osdmap 39226
> want next osdmap
> epoch 10
>         mon0    10.210.32.11:6789
>         mon1    10.210.33.11:6789
>         mon2    10.210.34.11:6789
> 582277  osd19   37.e52208ae     rb.0.1676178.2ae8944a.000000005ffd
> write
> 582278  osd2    37.f52b9433     rb.0.1676178.2ae8944a.000000006000
> write
> 582279  osd2    37.b3f0aae3     rb.0.1676178.2ae8944a.00000000641b
> write
> 582280  osd28   37.d8768bba     rb.0.1676178.2ae8944a.00000000641c
> write
> 582282  osd29   37.a923b4c6     rb.0.1676178.2ae8944a.000000008032
> write
> 582283  osd28   37.a2510620     rb.0.1676178.2ae8944a.000000008034
> write
> 582289  osd18   37.c96bc19d     rb.0.1345def.2ae8944a.0000001401cf
> write
> 582290  osd1    37.3edba98c     rb.0.165171e.238e1f29.000000039fe3
> write
> 582291  osd20   37.89f3f734     rb.0.1676160.238e1f29.0000000002ee
> write
> 582292  osd20   37.89f3f734     rb.0.1676160.238e1f29.0000000002ee
> write
> 582293  osd2    37.d34be89b     rb.0.1345def.2ae8944a.00000003c961
> read
> 582294  osd4    37.c611292f     rb.0.1345def.2ae8944a.00000007a19a
> read
> 582295  osd20   37.b1ae9634     rb.0.1345def.2ae8944a.00000003def7
> read
> 582296  osd19   37.3928a207     rb.0.1689cf4.238e1f29.000000034106
> write
> 582297  osd15   37.1011005e     rb.0.1689d69.238e1f29.000000034614
> write
> 582298  osd15   37.1011005e     rb.0.1689d69.238e1f29.000000034614
> write
> 582299  osd24   37.80a8ba78     rb.0.167614e.2ae8944a.000000026149
> write
> 582300  osd18   37.fe0a5354     rb.0.1689d69.238e1f29.0000000221bb
> write
> 582301  osd18   37.fe0a5354     rb.0.1689d69.238e1f29.0000000221bb
> write
> 582302  osd18   37.fe0a5354     rb.0.1689d69.238e1f29.0000000221bb
> write
> 582303  osd18   37.fe0a5354     rb.0.1689d69.238e1f29.0000000221bb
> write
> 582304  osd18   37.fe0a5354     rb.0.1689d69.238e1f29.0000000221bb
> write
> 582305  osd18   37.fe0a5354     rb.0.1689d69.238e1f29.0000000221bb
> write
> 582306  osd18   37.fe0a5354     rb.0.1689d69.238e1f29.0000000221bb
> write
> 582307  osd18   37.fe0a5354     rb.0.1689d69.238e1f29.0000000221bb
> write
> 582308  osd2    37.d142a662     rb.0.1689d69.238e1f29.0000000221bc
> write
> 582309  osd15   37.1011005e     rb.0.1689d69.238e1f29.000000034614
> write
> 582310  osd15   37.1011005e     rb.0.1689d69.238e1f29.000000034614
> write
> 582311  osd24   37.3c5f1cf8     rb.0.1689d69.238e1f29.0000000221b9
> write
> 582312  osd24   37.3c5f1cf8     rb.0.1689d69.238e1f29.0000000221b9
> write
> 582313  osd24   37.3c5f1cf8     rb.0.1689d69.238e1f29.0000000221b9
> write
> 582314  osd24   37.3c5f1cf8     rb.0.1689d69.238e1f29.0000000221b9
> write
> 582315  osd24   37.3c5f1cf8     rb.0.1689d69.238e1f29.0000000221b9
> write
> 582316  osd24   37.3c5f1cf8     rb.0.1689d69.238e1f29.0000000221b9
> write
> 582317  osd24   37.3c5f1cf8     rb.0.1689d69.238e1f29.0000000221b9
> write
> 582318  osd24   37.3c5f1cf8     rb.0.1689d69.238e1f29.0000000221b9
> write
> 582319  osd1    37.36930f4c     rb.0.1689d69.238e1f29.0000000221ba
> write
> 582320  osd2    37.d142a662     rb.0.1689d69.238e1f29.0000000221bc
> write
> 582321  osd2    37.d142a662     rb.0.1689d69.238e1f29.0000000221bc
> write
> 582322  osd2    37.d142a662     rb.0.1689d69.238e1f29.0000000221bc
> write
> 582323  osd2    37.d142a662     rb.0.1689d69.238e1f29.0000000221bc
> write
> 582324  osd2    37.d142a662     rb.0.1689d69.238e1f29.0000000221bc
> write
> 582325  osd2    37.d142a662     rb.0.1689d69.238e1f29.0000000221bc
> write
> 582326  osd2    37.d142a662     rb.0.1689d69.238e1f29.0000000221bc
> write
> 582327  osd2    37.d142a662     rb.0.1689d69.238e1f29.0000000221bc
> write
> 582328  osd27   37.886d4a41     rb.0.1689d69.238e1f29.0000000221bd
> write
> 582329  osd19   37.3928a207     rb.0.1689cf4.238e1f29.000000034106
> write
> 582330  osd23   37.834e04c8     rb.0.1689cf4.238e1f29.00000002485f
> write
> 582331  osd2    37.a7fb0062     rb.0.1689cf4.238e1f29.00000002bfea
> write
> 582332  osd2    37.a7fb0062     rb.0.1689cf4.238e1f29.00000002bfea
> write
> 582333  osd29   37.e3741a18     rb.0.1689cf4.238e1f29.00000002c0f6
> write
> 582334  osd19   37.dec7ad07     rb.0.1689cf4.238e1f29.000000031fe7
> write
> 582335  osd19   37.dec7ad07     rb.0.1689cf4.238e1f29.000000031fe7
> write
> 582336  osd26   37.3f51e7ac     rb.0.1689cf4.238e1f29.0000000320e7
> write
> 582337  osd18   37.536a8f6      rb.0.1689cf4.238e1f29.000000033fe6
> write
> 582338  osd18   37.536a8f6      rb.0.1689cf4.238e1f29.000000033fe6
> write
> 582339  osd18   37.e3d0fce6     rb.0.1689cf4.238e1f29.00000002005f
> write
> 582340  osd1    37.36930f4c     rb.0.1689d69.238e1f29.0000000221ba
> write
> 582341  osd1    37.36930f4c     rb.0.1689d69.238e1f29.0000000221ba
> write
> 582342  osd1    37.36930f4c     rb.0.1689d69.238e1f29.0000000221ba
> write
> 582343  osd1    37.36930f4c     rb.0.1689d69.238e1f29.0000000221ba
> write
> 582344  osd27   37.886d4a41     rb.0.1689d69.238e1f29.0000000221bd
> write
> 582345  osd27   37.886d4a41     rb.0.1689d69.238e1f29.0000000221bd
> write
> 582346  osd27   37.886d4a41     rb.0.1689d69.238e1f29.0000000221bd
> write
> 582347  osd16   37.9764b7df     rb.0.1676178.2ae8944a.000000016392
> write
> 582348  osd16   37.9764b7df     rb.0.1676178.2ae8944a.000000016392
> write
> 582349  osd16   37.9764b7df     rb.0.1676178.2ae8944a.000000016392
> write
> 582350  osd16   37.9764b7df     rb.0.1676178.2ae8944a.000000016392
> write
> 582351  osd16   37.9764b7df     rb.0.1676178.2ae8944a.000000016392
> write
> 582352  osd16   37.9764b7df     rb.0.1676178.2ae8944a.000000016392
> write
> 582353  osd4    37.1067846f     rb.0.1676178.2ae8944a.00000003a43c
> write
> 582354  osd26   37.ecccc929     rb.0.1676178.2ae8944a.00000003c703
> write
> 582355  osd18   37.ed91673e     rb.0.1676178.2ae8944a.00000003c704
> write
> 582356  osd23   37.58fdaf5a     rb.0.1676178.2ae8944a.00000003dfe1
> write
> 582357  osd23   37.58fdaf5a     rb.0.1676178.2ae8944a.00000003dfe1
> write
> 582358  osd23   37.a050d40f     rb.0.1676178.2ae8944a.00000003e0a4
> write
> 582359  osd1    37.c3f1f60c     rb.0.1676178.2ae8944a.00000003e24f
> write
> 582360  osd18   37.8ad6d326     rb.0.1676178.2ae8944a.00000003e50b
> write
> 582361  osd28   37.cdf2c9ba     rb.0.1676178.2ae8944a.00000003e50c
> write
> 582362  osd27   37.99cdab71     rb.0.1676178.2ae8944a.00000003e50d
> write
> 582363  osd27   37.99cdab71     rb.0.1676178.2ae8944a.00000003e50d
> write
> 582364  osd16   37.1eaec3ff     rb.0.1676178.2ae8944a.00000000641a
> write
> 582365  osd24   37.d881c8e      rb.0.1676178.2ae8944a.000000007ffc
> write
> 582366  osd24   37.d881c8e      rb.0.1676178.2ae8944a.000000007ffc
> write
> 582367  osd24   37.d881c8e      rb.0.1676178.2ae8944a.000000007ffc
> write
> 582368  osd27   37.d054a135     rb.0.1676178.2ae8944a.00000000842c
> write
> 582369  osd27   37.d054a135     rb.0.1676178.2ae8944a.00000000842c
> write
> 582370  osd18   37.876fd4d7     rb.0.1676178.2ae8944a.000000008430
> write
> 582371  osd19   37.e08b5547     rb.0.1676178.2ae8944a.000000008431
> write
> 582372  osd26   37.cd501c       rb.0.1676178.2ae8944a.00000000843e
> write
> 582373  osd26   37.cd501c       rb.0.1676178.2ae8944a.00000000843e
> write
> 582374  osd2    37.c5381d37     rb.0.1676178.2ae8944a.000000015ff5
> write
> 582375  osd2    37.c5381d37     rb.0.1676178.2ae8944a.000000015ff5
> write
> 582376  osd17   37.82f3d395     rb.0.1676178.2ae8944a.000000016173
> write
> 582377  osd19   37.9af0f044     rb.0.1676178.2ae8944a.000000016389
> write
> 582378  osd29   37.6d121a86     rb.0.1676178.2ae8944a.00000002004a
> write
> 582379  osd2    37.ee45629b     rb.0.1689d69.238e1f29.00000000484e
> write
> 582380  osd28   37.b24e1c3a     rb.0.1689d69.238e1f29.000000005ffd
> write
> 582381  osd28   37.b24e1c3a     rb.0.1689d69.238e1f29.000000005ffd
> write
> 582382  osd27   37.56c53271     rb.0.1689d69.238e1f29.00000000618d
> write
> 582383  osd4    37.a5d8fbaf     rb.0.1689d69.238e1f29.000000007ffc
> write
> 582384  osd4    37.a5d8fbaf     rb.0.1689d69.238e1f29.000000007ffc
> write
> 582385  osd4    37.a5d8fbaf     rb.0.1689d69.238e1f29.000000007ffc
> write
> 582386  osd1    37.947c5bcc     rb.0.1689d69.238e1f29.0000000080bd
> write
> 582387  osd26   37.7ae1d612     rb.0.1689d69.238e1f29.000000008112
> write
> 582388  osd18   37.9bf4cf3c     rb.0.1689d69.238e1f29.0000000081eb
> write
> 582389  osd1    37.9d5cf7f0     rb.0.1689d69.238e1f29.0000000081ec
> write
> 582390  osd19   37.4f750fee     rb.0.1689d69.238e1f29.0000000081f0
> write
> 582391  osd19   37.3db853ad     rb.0.1689d69.238e1f29.000000009ffb
> write
> 582392  osd19   37.3db853ad     rb.0.1689d69.238e1f29.000000009ffb
> write
> 582393  osd26   37.94b385c      rb.0.1689d69.238e1f29.00000000a1d1
> write
> 582394  osd28   37.fd40607a     rb.0.1689d69.238e1f29.000000022197
> write
> 582395  osd15   37.1011005e     rb.0.1689d69.238e1f29.000000034614
> write
> 582396  osd27   37.c42eab1      rb.0.1689d69.238e1f29.000000036160
> write
> 582397  osd2    37.2ac07662     rb.0.1689d69.238e1f29.00000001fffc
> write
> 582398  osd24   37.80a8ba78     rb.0.167614e.2ae8944a.000000026149
> write
> 582399  osd18   37.5c0f9c25     rb.0.167614e.2ae8944a.00000002a133
> write
> 582400  osd2    37.89e0f5b3     rb.0.167614e.2ae8944a.00000002ffe8
> write
> 582401  osd26   37.9be1215c     rb.0.167614e.2ae8944a.00000000042b
> write
> 582402  osd23   37.a64e7d0f     rb.0.167614e.2ae8944a.00000000238f
> write
> 582403  osd20   37.da28f8ab     rb.0.167614e.2ae8944a.000000003ffe
> write
> 582404  osd24   37.b7af1f40     rb.0.167614e.2ae8944a.000000004005
> write
> 582405  osd20   37.cecda9b4     rb.0.167614e.2ae8944a.000000004b8a
> write
> 582406  osd19   37.29bcdcae     rb.0.167614e.2ae8944a.000000005ffd
> write
> 582407  osd19   37.29bcdcae     rb.0.167614e.2ae8944a.000000005ffd
> write
> 582408  osd27   37.f29b1b01     rb.0.167614e.2ae8944a.000000006242
> write
> 582409  osd18   37.d68a6565     rb.0.167614e.2ae8944a.000000007ffc
> write
> 582410  osd18   37.d68a6565     rb.0.167614e.2ae8944a.000000007ffc
> write
> 582411  osd24   37.87cc6a8e     rb.0.167614e.2ae8944a.0000000082f2
> write
> 582412  osd15   37.a91eb70d     rb.0.167614e.2ae8944a.000000009ffb
> write
> 582413  osd15   37.3b20fd05     rb.0.167614e.2ae8944a.00000000a03a
> write
> 582414  osd2    37.89e0f5b3     rb.0.167614e.2ae8944a.00000002ffe8
> write
> 582415  osd2    37.89e0f5b3     rb.0.167614e.2ae8944a.00000002ffe8
> write
> 582416  osd26   37.f0bb2a2a     rb.0.167614e.2ae8944a.0000000301cc
> write
> 582417  osd26   37.3b4ebe12     rb.0.167614e.2ae8944a.0000000301cd
> write
> 582418  osd24   37.346ed74e     rb.0.167614e.2ae8944a.000000031fe7
> write
> 582419  osd24   37.346ed74e     rb.0.167614e.2ae8944a.000000031fe7
> write
> 582420  osd27   37.3f46e175     rb.0.167614e.2ae8944a.00000003215c
> write
> 582421  osd18   37.796dd926     rb.0.167614e.2ae8944a.00000003217a
> write
> 582422  osd19   37.4287ec84     rb.0.167614e.2ae8944a.000000033fe6
> write
> 582423  osd19   37.4287ec84     rb.0.167614e.2ae8944a.000000033fe6
> write
> 582424  osd26   37.84b7afe9     rb.0.167614e.2ae8944a.000000034112
> write
> 582425  osd24   37.f9a234b8     rb.0.167614e.2ae8944a.000000034113
> write
> 582426  osd18   37.6bd0b876     rb.0.167614e.2ae8944a.000000035fe5
> write
> 582427  osd18   37.6bd0b876     rb.0.167614e.2ae8944a.000000035fe5
> write
> 582428  osd23   37.1ae6123b     rb.0.167614e.2ae8944a.00000003611a
> write
> 582429  osd18   37.8202a597     rb.0.167614e.2ae8944a.00000003611b
> write
> 582430  osd17   37.4d252e13     rb.0.167614e.2ae8944a.000000037fe4
> write
> 582431  osd17   37.4d252e13     rb.0.167614e.2ae8944a.000000037fe4
> write
> 582432  osd4    37.7caa0b       rb.0.167614e.2ae8944a.00000003800a
> write
> 582433  osd4    37.5917feaf     rb.0.167614e.2ae8944a.0000000380cd
> write
> 582434  osd29   37.abe267c6     rb.0.167614e.2ae8944a.0000000380cf
> write
> 582435  osd16   37.dce75fe7     rb.0.167614e.2ae8944a.000000039fe3
> write
> 582436  osd16   37.dce75fe7     rb.0.167614e.2ae8944a.000000039fe3
> write
> 582437  osd1    37.f04f6991     rb.0.167614e.2ae8944a.00000003a02f
> write
> 582438  osd1    37.f04f6991     rb.0.167614e.2ae8944a.00000003a02f
> write
> 582439  osd27   37.33f8b641     rb.0.167614e.2ae8944a.00000003cc91
> write
> 582440  osd22   37.be13ad3d     rb.0.167614e.2ae8944a.00000003e40d
> write
> 582441  osd18   37.bbf96366     rb.0.167614e.2ae8944a.000000020035
> write
> 582442  osd18   37.bbf96366     rb.0.167614e.2ae8944a.000000020035
> write
> 582443  osd1    37.3edba98c     rb.0.165171e.238e1f29.000000039fe3
> write
> 582444  osd1    37.3edba98c     rb.0.165171e.238e1f29.000000039fe3
> write
> 582445  osd1    37.3edba98c     rb.0.165171e.238e1f29.000000039fe3
> write
> 582446  osd16   37.4800911f     rb.0.165171e.238e1f29.00000003a085
> write
> 582447  osd2    37.bdec20a3     rb.0.165171e.238e1f29.00000003a12c
> write
> 582448  osd23   37.33d51508     rb.0.165171e.238e1f29.00000003a12d
> write
> 582449  osd16   37.8a8ec527     rb.0.165171e.238e1f29.00000003a138
> write
> 582450  osd17   37.d7435c13     rb.0.165171e.238e1f29.00000003a139
> write
> 582451  osd17   37.d7435c13     rb.0.165171e.238e1f29.00000003a139
> write
> 582452  osd15   37.7e4b4f9e     rb.0.165171e.238e1f29.00000003a13a
> write
> 582453  osd26   37.ae80f3ac     rb.0.165171e.238e1f29.00000003a13b
> write
> 582454  osd26   37.ae80f3ac     rb.0.165171e.238e1f29.00000003a13b
> write
> 582455  osd15   37.fc3cadcd     rb.0.165171e.238e1f29.00000003a13c
> write
> 582456  osd15   37.fc3cadcd     rb.0.165171e.238e1f29.00000003a13c
> write
> 582457  osd19   37.dcc1f244     rb.0.165171e.238e1f29.00000003a13d
> write
> 582458  osd28   37.c5ce907a     rb.0.165171e.238e1f29.00000003a13e
> write
> 582459  osd28   37.c5ce907a     rb.0.165171e.238e1f29.00000003a13e
> write
> 582460  osd18   37.d7371b26     rb.0.165171e.238e1f29.00000003a13f
> write
> 582461  osd18   37.89ec9be5     rb.0.165171e.238e1f29.00000003a140
> write
> 582462  osd4    37.5032c82f     rb.0.165171e.238e1f29.00000003bfe2
> write
> 582463  osd4    37.5032c82f     rb.0.165171e.238e1f29.00000003bfe2
> write
> 582464  osd26   37.54a4fd50     rb.0.165171e.238e1f29.00000003bfe9
> write
> 582465  osd23   37.2929897b     rb.0.165171e.238e1f29.00000003c136
> write
> 582466  osd23   37.2929897b     rb.0.165171e.238e1f29.00000003c136
> write
> 582467  osd20   37.b9aff419     rb.0.165171e.238e1f29.00000003dfe1
> write
> 582468  osd20   37.b9aff419     rb.0.165171e.238e1f29.00000003dfe1
> write
> 582469  osd24   37.685a8638     rb.0.165171e.238e1f29.00000003e08c
> write
> 582470  osd26   37.adfd8b12     rb.0.165171e.238e1f29.00000003e14a
> write
> 582471  osd2    37.a67386a2     rb.0.165171e.238e1f29.00000003e14b
> write
> 582472  osd18   37.688ac754     rb.0.165171e.238e1f29.00000002802c
> write
> 582473  osd15   37.d2bed74d     rb.0.1676160.238e1f29.00000002000d
> write
> 582474  osd23   37.9c9a1a8f     rb.0.165171e.238e1f29.000000020002
> write

Have you looked at the Ceph servers?  krbd is really just a client, so if
the OSDs don't reply to its requests, it can't do much.  From a quick look,
this doesn't look like a krbd bug.
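
To see which servers to look at first, it can help to tally the
outstanding requests in the dump above per OSD. The following is a rough
sketch, not an official tool: it assumes each request line has the layout
seen in the quoted output (tid, osdN, pgid, object name, then the op,
which the mail client may have wrapped onto the next line), and the
function name count_inflight_by_osd is hypothetical.

```python
import re
from collections import Counter

# Assumed request-line layout, inferred from the quoted dump:
#   <tid>  osd<N>  <pgid>  <object name>  [<op>]
# An optional leading "> " is tolerated so pasted mail quotes also parse.
REQ_RE = re.compile(r"^(?:>\s*)?(\d+)\s+osd(\d+)\s+(\S+)\s+(\S+)")


def count_inflight_by_osd(dump_text):
    """Count in-flight requests per OSD id in a krbd debugfs request dump."""
    per_osd = Counter()
    for line in dump_text.splitlines():
        m = REQ_RE.match(line.strip())
        if m:
            per_osd[int(m.group(2))] += 1
    return per_osd
```

Calling count_inflight_by_osd(text).most_common() lists the OSDs with the
most unanswered requests first. Lines that do not start with a request id
(the monitor list, the osdmap section, wrapped "write"/"read" words) are
ignored.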

> epoch 39226
> flags
> pg_pool 0 pg_num 256 / 255
> pg_pool 1 pg_num 128 / 127
> pg_pool 4 pg_num 32 / 31
> pg_pool 19 pg_num 512 / 511
> pg_pool 25 pg_num 8 / 7
> pg_pool 27 pg_num 1 / 0
> pg_pool 28 pg_num 1 / 0
> pg_pool 29 pg_num 1 / 0
> pg_pool 30 pg_num 1 / 0
> pg_pool 31 pg_num 1 / 0
> pg_pool 32 pg_num 1 / 0
> pg_pool 33 pg_num 1 / 0
> pg_pool 34 pg_num 1 / 0
> pg_pool 35 pg_num 2 / 1
> pg_pool 36 pg_num 1 / 0
> pg_pool 37 pg_num 64 / 63
> pg_pool 40 pg_num 2 / 1
> pg_pool 41 pg_num 1 / 0
>         osd0    10.210.33.22:6815       100%    (exists, up)
>         osd1    10.210.33.22:6800       100%    (exists, up)
>         osd2    10.210.33.22:6805       100%    (exists, up)
>         osd3    10.210.32.22:6800         0%    (doesn't exist)
>         osd4    10.210.33.22:6810       100%    (exists, up)
>         osd5    10.210.33.22:6820       100%    (doesn't exist)
>         osd6    10.210.33.22:6805       100%    (doesn't exist)
>         osd7    10.210.33.22:6825       100%    (doesn't exist)
>         osd8    10.210.33.22:6830       100%    (doesn't exist)
>         osd9    10.210.33.22:6835       100%    (doesn't exist)
>         osd10   10.210.33.22:6805       100%    (doesn't exist)
>         osd11   10.210.33.22:6845       100%    (doesn't exist)
>         osd12   10.210.33.22:6850       100%    (doesn't exist)
>         osd13   10.210.33.22:6855       100%    (doesn't exist)
>         osd14   10.210.33.22:6860       100%    (doesn't exist)
>         osd15   10.210.32.23:6800       100%    (exists, up)
>         osd16   10.210.32.23:6807       100%    (exists, up)
>         osd17   10.210.32.23:6801       100%    (exists, up)
>         osd18   10.210.32.23:6816       100%    (exists, up)
>         osd19   10.210.32.23:6812       100%    (exists, up)
>         osd20   10.210.34.21:6800       100%    (exists, up)
>         osd21   10.210.34.21:6804       100%    (exists, up)
>         osd22   10.210.34.21:6809       100%    (exists, up)
>         osd23   10.210.34.21:6814       100%    (exists, up)
>         osd24   10.210.34.21:6819       100%    (exists, up)
>         osd25   10.210.33.21:6800       100%    (exists, up)
>         osd26   10.210.33.21:6805       100%    (exists, up)
>         osd27   10.210.33.21:6810       100%    (exists, up)
>         osd28   10.210.33.21:6815       100%    (exists, up)
>         osd29   10.210.33.21:6820       100%    (exists, up)
>         osd30   10.210.33.22:6865       100%    (doesn't exist)
>         osd31   10.210.33.22:6870         0%    (doesn't exist)
>         osd32   10.210.33.21:6800         0%    (doesn't exist)
>
> I don't know how to interpret this. The "doesn't exist" lines are correct;
> these OSDs were removed.
> Why are they still known to the rbd client? The OSDs were removed before
> the client was booted.

What procedure did you follow to remove those OSDs?

Thanks,

                Ilya

