Re: block.db/block.wal device performance dropped after upgrade to 14.2.10

In my case I only have 16 GB of RAM per node with 5 OSDs on each of them, so I
actually have to tune osd_memory_target=2147483648 (2 GiB), because with the
default value of 4 GB my OSD processes tend to get killed by the OOM killer.
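For reference, the tuning itself is just a single option; a minimal sketch of
how I apply it (whether you put it in ceph.conf or the monitor config database
is a matter of preference, and 2147483648 is simply 2 GiB in bytes):

    # ceph.conf on each OSD node
    [osd]
    osd_memory_target = 2147483648

    # or, equivalently, via the config database
    ceph config set osd osd_memory_target 2147483648
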
That is what I was looking into before finding the correct solution. I
removed the osd_memory_target limitation, essentially setting it back to the
default of 4 GB. That helped in the sense that the workload on the block.db
device dropped significantly, but the overall pattern was still not the same -
for example, there were still no merges on the block.db device. Everything
came back to the usual pattern only with bluefs_buffered_io=true.
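In case it helps anyone reproducing this, flipping the option back is just a
config change; something like the following (a sketch only - as far as I know
the OSDs need a restart for bluefs_buffered_io to actually take effect):

    # check what a running OSD currently uses
    ceph config show osd.0 bluefs_buffered_io

    # set it back to the old default, then restart the OSDs
    ceph config set osd bluefs_buffered_io true
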
The osd_memory_target limitation was implemented somewhere around the
10 -> 12 release upgrade, I think, before the memory auto-scaling feature for
bluestore was introduced - that is when my OSDs started to get OOM-killed.
They worked fine before that.
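
If anyone wants to check whether they are hitting the same OOM behaviour, the
kernel log is the quickest place to look; something along these lines,
adjusted for your distro:

    # look for OSD processes killed by the kernel OOM killer
    dmesg -T | grep -i 'out of memory'
    journalctl -k | grep -i 'killed process'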

Thu, 6 Aug 2020 at 20:28, Mark Nelson <mnelson@xxxxxxxxxx>:

> Yeah, there are cases where enabling it will improve performance as
> rocksdb can then use the page cache as a (potentially large) secondary
> cache beyond the block cache and avoid hitting the underlying devices
> for reads.  Do you have a lot of spare memory for page cache on your OSD
> nodes? You may be able to improve the situation with
> bluefs_buffered_io=false by increasing the osd_memory_target which
> should give the rocksdb block cache more memory to work with directly.
> One downside is that we currently double cache onodes in both the
> rocksdb cache and the bluestore onode cache, which hurts us when memory is
> limited.  We have some experimental work that might help in this area by
> better balancing bluestore onode and rocksdb block caches but it needs
> to be rebased after Adam's column family sharding work.
>
> The reason we had to disable bluefs_buffered_io again was that we had
> users with certain RGW workloads where the kernel started swapping large
> amounts of memory on the OSD nodes despite seemingly having free memory
> available.  This caused huge latency spikes and IO slowdowns (even
> stalls).  We never noticed it in our QA test suites and it doesn't
> appear to happen with RBD workloads as far as I can tell, but when it
> does happen it's really painful.
>
>
> Mark
>
>
> On 8/6/20 6:53 AM, Manuel Lausch wrote:
> > Hi,
> >
> > I found the reason for this behavior change.
> > With 14.2.10 the default value of "bluefs_buffered_io" was changed from
> > true to false.
> > https://tracker.ceph.com/issues/44818
> >
> > Configuring this to true seems to have solved my problems.
> >
> > Regards
> > Manuel
> >
> > On Wed, 5 Aug 2020 13:30:45 +0200
> > Manuel Lausch <manuel.lausch@xxxxxxxx> wrote:
> >
> >> Hello Vladimir,
> >>
> >> I just tested this on a single-node test cluster with 60 HDDs (3 of
> >> them with bluestore, without a separate WAL and DB).
> >>
> >> With 14.2.10, I see a lot of read IOPS on the bluestore OSDs while
> >> snap trimming. With 14.2.9 this was not an issue.
> >>
> >> I wonder if this would explain the huge number of slow ops on my big
> >> test cluster (44 nodes, 1056 OSDs) while snap trimming. I
> >> cannot test a downgrade there, because there are no packages of older
> >> releases available for CentOS 8.
> >>
> >> Regards
> >> Manuel
> >>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



