You can:
- change these parameters and use ceph-objectstore-tool
- add an OSD host; the resulting cluster rebuild will reduce the number
  of files in the directories
- wait until the "split" operations are over ;-)

In our case, we could afford to wait until the "split" operations were
over (we have 2 clusters in slightly different configurations storing
the same data).

Hint: when creating a new pool, use the "expected_num_objects" parameter:
https://www.suse.com/documentation/ses-4/book_storage_admin/data/ceph_pools_operate.html

Piotr Nowosielski
Senior Systems Engineer
Zespół Infrastruktury 5
Grupa Allegro sp. z o.o.
Tel: +48 512 08 55 92

-----Original Message-----
From: Anton Dmitriev [mailto:tech@xxxxxxxxxx]
Sent: Wednesday, May 10, 2017 9:19 AM
To: Piotr Nowosielski <piotr.nowosielski@xxxxxxxxxxxxxxxx>; ceph-users@xxxxxxxxxxxxxx
Subject: Re: All OSD fails after few requests to RGW

How did you solve it? Did you set new split/merge thresholds and then
apply them manually on each OSD with

ceph-objectstore-tool \
  --data-path /var/lib/ceph/osd/ceph-${osd_num} \
  --journal-path /var/lib/ceph/osd/ceph-${osd_num}/journal \
  --log-file=/var/log/ceph/objectstore_tool.${osd_num}.log \
  --op apply-layout-settings \
  --pool default.rgw.buckets.data

And how can I see in the logs that a split occurs?

On 10.05.2017 10:13, Piotr Nowosielski wrote:
> Hey,
> We had similar problems. Look for information on "filestore merge and
> split".
>
> Some explanation:
> After reaching a certain number of files in a directory (it depends
> on the 'filestore merge threshold' and 'filestore split multiple'
> parameters), the OSD rebuilds the structure of that directory.
> As files arrive, the OSD creates new subdirectories and moves some of
> the files there.
> As files are removed, the OSD reduces the number of subdirectories.
>
> --
> Piotr Nowosielski
> Senior Systems Engineer
> Zespół Infrastruktury 5
> Grupa Allegro sp. z o.o.
> Tel: +48 512 08 55 92
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf
> Of Anton Dmitriev
> Sent: Wednesday, May 10, 2017 8:14 AM
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: All OSD fails after few requests to RGW
>
> Hi!
>
> I increased pg_num and pgp_num for pool default.rgw.buckets.data from
> 2048 to 4096, and it seems that the situation became a bit better:
> the cluster now dies after 20-30 PUTs, not after one. Could someone
> please give me some recommendations on how to rescue the cluster?
>
> On 27.04.2017 09:59, Anton Dmitriev wrote:
>> The cluster was running well for a long time, but last week OSDs
>> started to fail.
>> We use the cluster as image storage for OpenNebula with a small load
>> and as object storage with a high load.
>> Sometimes the disks of some OSDs are utilized at 100%, with iostat
>> showing avgqu-sz over 1000 while reading or writing only a few
>> kilobytes per second; the OSDs on these disks become unresponsive
>> and the cluster marks them down. We lowered the load on the object
>> storage and the situation became better.
>>
>> Yesterday the situation became worse:
>> If the RGWs are disabled and there are no requests to the object
>> storage, the cluster performs well, but as soon as we enable the
>> RGWs and make a few PUTs or GETs, all non-SSD OSDs on all storage
>> nodes end up in the same situation described above.
>> iotop shows that xfsaild/<disk> is hammering the disks.
>>
>> Running trace-cmd record -e xfs\* for 10 seconds shows 10 million
>> events; as I understand it, that means ~360,000 objects to push per
>> OSD over those 10 seconds.
>> $ wc -l t.t
>> 10256873 t.t
>>
>> Fragmentation on one of these disks is about 3%.
>>
>> More information about the cluster:
>>
>> https://yadi.sk/d/Y63mXQhl3HPvwt
>>
>> Also debug logs for osd.33 while the problem occurs:
>>
>> https://yadi.sk/d/kiqsMF9L3HPvte
>>
>> debug_osd = 20/20
>> debug_filestore = 20/20
>> debug_tp = 20/20
>>
>> Ubuntu 14.04
>> $ uname -a
>> Linux storage01 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29
>> 20:22:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>>
>> Ceph 10.2.7
>>
>> 7 storage nodes: Supermicro, 28 OSDs on 4 TB 7200 rpm disks in JBOD,
>> journals on a RAID10 of 4 Intel S3510 800 GB SSDs, plus 2 SSD OSDs
>> (Intel S3710 400 GB) for RGW meta and index.
>> One of these nodes differs only in the number of OSDs: it has 26
>> 4 TB OSDs instead of 28.
>>
>> Storage nodes connect to each other over bonded 2x10 Gbit links;
>> clients connect to the storage nodes over bonded 2x1 Gbit links.
>>
>> 5 storage nodes have 2 x E5-2650v2 CPUs and 256 GB RAM; the other 2
>> have 2 x E5-2690v3 CPUs and 512 GB RAM.
>>
>> 7 mons
>> 3 rgw
>>
>> Please help me rescue the cluster.
>
> --
> Dmitriev Anton

--
Dmitriev Anton
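
A minimal sketch of the threshold change discussed above, assuming a
Jewel-era (10.2.x) filestore cluster; the values below are illustrative,
not tested recommendations. A subdirectory is split once it holds about
filestore_split_multiple * abs(filestore_merge_threshold) * 16 files
(320 with the defaults of 2 and 10), so raising these parameters
postpones splitting:

# ceph.conf on each OSD node; restart the OSDs to pick it up
[osd]
# a negative merge threshold disables merging; the split point
# becomes 8 * abs(-40) * 16 = 5120 files per directory
filestore merge threshold = -40
filestore split multiple = 8

# then, with each OSD stopped in turn, re-apply the directory layout
# to the already-split pool (the same command as in the thread):
ceph-objectstore-tool \
  --data-path /var/lib/ceph/osd/ceph-${osd_num} \
  --journal-path /var/lib/ceph/osd/ceph-${osd_num}/journal \
  --log-file=/var/log/ceph/objectstore_tool.${osd_num}.log \
  --op apply-layout-settings \
  --pool default.rgw.buckets.data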
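
For the "expected_num_objects" hint, a sketch of both pool-sizing
commands, assuming Jewel CLI syntax; the new pool name, object count,
and ruleset name are placeholders for illustration. Note that
pre-splitting at creation time only takes effect together with a
negative filestore merge threshold:

# grow placement groups on the existing pool (what Anton did)
ceph osd pool set default.rgw.buckets.data pg_num 4096
ceph osd pool set default.rgw.buckets.data pgp_num 4096

# or create a new pool pre-split for the expected object count
ceph osd pool create default.rgw.buckets.data.new 4096 4096 \
    replicated replicated_ruleset 1000000000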
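
On "how can I see in the logs that a split occurs": the exact log lines
vary by release, so a version-independent check (a sketch; the OSD id
and pgid here are illustrative) is to watch the DIR_* fan-out of a PG
directory on disk, since a filestore split creates deeper DIR_<hex>
nesting:

cd /var/lib/ceph/osd/ceph-33/current
# pick one PG directory of the data pool, e.g. 11.2f_head
find 11.2f_head -type d -name 'DIR_*' | wc -l   # subdirectory count
find 11.2f_head -type f | wc -l                 # object files in the PG
# a jump in the directory count between runs means a split happened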