Re: Troubleshooting hanging storage backend whenever there is any cluster change

Hi, in our `ceph.conf` we have:

  mon_max_pg_per_osd = 300
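
(Since ceph.conf is only read at daemon startup, the value a running mon
actually uses can be confirmed via its admin socket, and it can also be
changed at runtime; `mon.a` below is just a placeholder for one of the
mon IDs:)

  # value the running monitor actually uses
  ceph daemon mon.a config get mon_max_pg_per_osd

  # change it at runtime, without restarting the mons
  ceph tell mon.* injectargs '--mon_max_pg_per_osd=400'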

While the host is offline (9 OSDs down):

  4352 PGs * 3 / 62 OSDs ~ 210 PGs per OSD

If all OSDs are online:

  4352 PGs * 3 / 71 OSDs ~ 183 PGs per OSD

... so this doesn't seem to be the issue.
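
(The averages above hide any imbalance between OSDs, though. The actual
per-OSD placement group count can be read from the PGS column of
`ceph osd df`:)

  # PGS column = number of placement groups currently mapped to each OSD
  ceph osd df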

If I understood you right, that's what you meant. If I got you wrong,
would you mind pointing me to one of those threads you mentioned?

Thanks :)

On 12.10.2018 14:03, Burkhard Linke wrote:
> Hi,
> 
> 
> On 10/12/2018 01:55 PM, Nils Fahldieck - Profihost AG wrote:
>> I rebooted a Ceph host and logged `ceph status` & `ceph health detail`
>> every 5 seconds. During this I encountered 'PG_AVAILABILITY Reduced data
>> availability: pgs peering'. At the same time some VMs hung as described
>> before.
> 
> Just a wild guess... you have 71 OSDs and about 4500 PGs with size=3,
> i.e. 13500 PG instances overall, resulting in ~190 PGs per OSD under
> normal circumstances.
> 
> If one host is down and the PGs have to re-peer, you might reach the
> limit of 200 PGs per OSD on some of the OSDs, resulting in stuck peering.
> 
> You can try raising this limit. There are several threads about this on
> the mailing list.
> 
> Regards,
> Burkhard
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


