OSD log being spammed with BlueStore stupidallocator dump

Wido den Hollander <wido@xxxxxxxx> · Thu, 11 Oct 2018 00:08:15 +0200

Hi,

On a Luminous cluster running a mix of 12.2.4, 12.2.5 and 12.2.8 I'm
seeing OSDs writing heavily to their logfiles, spitting out these lines:

2018-10-10 21:52:04.019037 7f90c2f0f700  0 stupidalloc 0x0x55828ae047d0 dump  0x15cd2078000~34000
2018-10-10 21:52:04.019038 7f90c2f0f700  0 stupidalloc 0x0x55828ae047d0 dump  0x15cd22cc000~24000
2018-10-10 21:52:04.019038 7f90c2f0f700  0 stupidalloc 0x0x55828ae047d0 dump  0x15cd2300000~20000
2018-10-10 21:52:04.019039 7f90c2f0f700  0 stupidalloc 0x0x55828ae047d0 dump  0x15cd2324000~24000
2018-10-10 21:52:04.019040 7f90c2f0f700  0 stupidalloc 0x0x55828ae047d0 dump  0x15cd26c0000~24000
2018-10-10 21:52:04.019041 7f90c2f0f700  0 stupidalloc 0x0x55828ae047d0 dump  0x15cd2704000~30000
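
For reference, each entry appears to be a free extent written as
offset~length, both in hexadecimal (going by the std::hex in the dump()
code quoted later in this message). A minimal sketch for decoding one
such token follows; the Extent struct and parse_extent() are
hypothetical helpers for illustration and are not part of Ceph.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// One free extent as printed in the log: start offset and length, in bytes.
// (Hypothetical helper type, not a Ceph structure.)
struct Extent {
  std::uint64_t offset;
  std::uint64_t length;
};

// Parse a token such as "0x15cd2078000~34000" into 'out'.
// Both numbers are read as hexadecimal; in the log only the offset
// carries a "0x" prefix. Returns false if the token is malformed.
bool parse_extent(const std::string& token, Extent& out) {
  const std::size_t tilde = token.find('~');
  if (tilde == std::string::npos || tilde == 0 || tilde + 1 >= token.size()) {
    return false;
  }
  const std::string off_str = token.substr(0, tilde);
  const std::string len_str = token.substr(tilde + 1);
  try {
    std::size_t used_off = 0;
    std::size_t used_len = 0;
    // Base 16 accepts an optional "0x" prefix.
    const std::uint64_t off = std::stoull(off_str, &used_off, 16);
    const std::uint64_t len = std::stoull(len_str, &used_len, 16);
    // Reject trailing junk after either number.
    if (used_off != off_str.size() || used_len != len_str.size()) {
      return false;
    }
    out.offset = off;
    out.length = len;
    return true;
  } catch (const std::exception&) {
    // std::stoull throws on non-numeric input or overflow.
    return false;
  }
}
```

Read this way, the first entry (0x15cd2078000~34000) is a free extent of
0x34000 = 212992 bytes (208 KiB).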

The logging is so fast that the OS disk in this case can't keep up and
hits 100% utilization.

This slows the OSD down, which causes slow requests, and the OSD then
starts to flap.

It seems that this is *only* happening on the fullest OSDs (~85%) in
this cluster, which have about 400 PGs each (yes, I know, that's
high).

Looking at StupidAllocator.cc I see this piece of code:

void StupidAllocator::dump()
{
  std::lock_guard<std::mutex> l(lock);
  for (unsigned bin = 0; bin < free.size(); ++bin) {
    ldout(cct, 0) << __func__ << " free bin " << bin << ": "
                  << free[bin].num_intervals() << " extents" << dendl;
    for (auto p = free[bin].begin();
         p != free[bin].end();
         ++p) {
      ldout(cct, 0) << __func__ << "  0x" << std::hex << p.get_start()
                    << "~" << p.get_len() << std::dec << dendl;
    }
  }
}
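
As far as I can tell from this loop, a single call emits one line per
bin plus one line per free extent, so the output grows with the number
of free extents. The sketch below models only that shape; FreeBin (a
std::map standing in for the real interval set) and dump_lines() are
hypothetical names, and it returns the lines instead of logging them.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for one bin of free extents: start offset -> length, in bytes.
// (Assumption: a plain std::map replaces the allocator's interval set.)
using FreeBin = std::map<std::uint64_t, std::uint64_t>;

// Mimics the shape of the dump() output: one header line per bin and
// one "0x<start>~<len>" line (hex) per free extent in that bin.
std::vector<std::string> dump_lines(const std::vector<FreeBin>& free_bins) {
  std::vector<std::string> lines;
  for (std::size_t bin = 0; bin < free_bins.size(); ++bin) {
    std::ostringstream header;
    header << "dump free bin " << bin << ": "
           << free_bins[bin].size() << " extents";
    lines.push_back(header.str());
    for (const auto& entry : free_bins[bin]) {
      std::ostringstream line;
      line << "dump  0x" << std::hex << entry.first << "~" << entry.second;
      lines.push_back(line.str());
    }
  }
  return lines;
}
```

With N free extents spread over B bins this yields N + B lines per
call, which would match the volume I'm seeing if the free space on
these OSDs is split into many small extents.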

I'm just wondering why it would spit out these lines and what's causing it.

Has anybody seen this before?

Wido
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com