OSD memory consumption significantly increased with greater rgw_obj_stripe_size

NOTE: rgw_max_chunk_size must be equal to rgw_obj_stripe_size, so when I refer to one I mean both.

For example, when I changed rgw_obj_stripe_size from 4M to 16M, OSD memory usage increased approximately 2.5 times.
The issue was reproduced with erasure-coded pools.

The OSD command dump_mempools shows that only the bytes of the anon pool increased.

Further investigation shows that the whole buffer::raw object received from the network
(created in alloc_aligned_buffer() at AsyncConnection.cc:623) stays alive after the request completes.
These whole 4M or 16M buffer::raw objects are kept with nref>0 by PrimaryLogPG::object_contexts,
through ObjectContext::attr_cache.

This issue was reproduced on both the luminous and master branches.


I see at least two kinds of possible improvement:

1) memcpy relatively small parts of a buffer::raw when creating a new buffer::ptr
  For example, with the following compile-time configuration parameters:
    BUFFER_MIN_SIZE_COPY_FROM = 64k
    BUFFER_MAX_SIZE_TO_COPY = 16k
    BUFFER_MIN_RATIO_TO_COPY = 128
  it will copy up to 512 bytes from a 64k raw object,
  or up to 16k from a 4M raw object,
  but will not copy anything from a 63k raw object.

  Pros: addresses all issues of this type (small references keeping large buffer::raw objects alive)
  Cons: unknown impact, for example on memory fragmentation
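
  The rule in option 1 could be sketched roughly as below. This is only my
  reading of the three example numbers, not existing Ceph code: the function
  name should_copy() and the exact way the thresholds combine are assumptions.

```cpp
#include <cstddef>

// Hypothetical compile-time thresholds, using the example values from option 1.
constexpr std::size_t BUFFER_MIN_SIZE_COPY_FROM = 64 * 1024;  // smallest raw worth copying out of
constexpr std::size_t BUFFER_MAX_SIZE_TO_COPY   = 16 * 1024;  // never memcpy more than this
constexpr std::size_t BUFFER_MIN_RATIO_TO_COPY  = 128;        // raw must be at least 128x the slice

// Decide whether a new reference to sub_len bytes of a raw buffer of
// raw_len bytes should be a private copy instead of a shared reference.
constexpr bool should_copy(std::size_t raw_len, std::size_t sub_len) {
  return raw_len >= BUFFER_MIN_SIZE_COPY_FROM &&
         sub_len <= BUFFER_MAX_SIZE_TO_COPY &&
         sub_len <= raw_len / BUFFER_MIN_RATIO_TO_COPY;
}
```

  With these values a 512-byte slice of a 64k raw is copied (64k / 128 = 512),
  a 16k slice of a 4M raw is copied (the 16k cap applies before the 32k ratio
  limit), and nothing is copied from a 63k raw.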

2) Improvements related specifically to PrimaryLogPG::object_contexts

  2.1) Set osd_pg_object_context_cache_count to 1 or 0
    Cons: the cache will effectively not work

  2.2) Rebuild the bufferlists of attr_cache entries when inserting them into the cache, so that the attrs are copied and the huge buffer can be freed later.
    Pros: minimal impact on other subsystems
    Cons: fixes only this particular case

  2.3) Limit object_contexts by total memory used, in addition to osd_pg_object_context_cache_count.
    Cons: the cache will probably not work, because each entry will occupy a lot of memory and all entries will be skipped.

  2.4) Remove object_contexts completely and create contexts on the fly every time.
    Cons: object_contexts does not look like a spare part that can be safely removed
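
  To illustrate option 2.2, here is a toy model of the pinning and of the
  rebuild-on-insert fix. It does not use the real Ceph buffer classes: Raw,
  Slice, rebuild() and cache_attr() are hypothetical stand-ins for
  buffer::raw, buffer::ptr and the attr_cache insert path.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for buffer::raw: a reference-counted block of bytes.
using Raw = std::shared_ptr<std::vector<char>>;

// Toy stand-in for buffer::ptr: a view that keeps the whole Raw alive.
struct Slice {
  Raw raw;
  std::size_t off;
  std::size_t len;
};

// Copy only the viewed bytes into a right-sized allocation of their own,
// so the result no longer references the original (possibly huge) Raw.
Slice rebuild(const Slice& s) {
  auto first = s.raw->begin() + static_cast<std::ptrdiff_t>(s.off);
  auto last = first + static_cast<std::ptrdiff_t>(s.len);
  return Slice{std::make_shared<std::vector<char>>(first, last), 0, s.len};
}

// Hypothetical cache insert: always store a rebuilt copy of the attr value.
void cache_attr(std::map<std::string, Slice>& attr_cache,
                const std::string& name, const Slice& value) {
  attr_cache[name] = rebuild(value);
}
```

  In this model, storing the Slice directly would keep the whole 4M or 16M
  block referenced for as long as the cache entry lives; storing the rebuilt
  copy costs one small memcpy per attr and lets the large block be freed once
  the request is done.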


We tested osd_pg_object_context_cache_count=1 as a hotfix,
and it reduced OSD memory usage significantly, regardless of rgw_obj_stripe_size.


Could somebody please clarify the purpose of PrimaryLogPG::object_contexts,
and maybe suggest how to fix this issue?


--

Best regards,
Aleksei Gutikov
Software Engineer | synesis.ru | Minsk. BY