NOTE: rgw_max_chunk_size must be equal to rgw_obj_stripe_size, so when I
refer to one I mean both.
For example, when I changed rgw_obj_stripe_size from 4M to 16M, OSD memory
usage increased approximately 2.5 times.
This issue was reproduced with erasure-coded pools.
The OSD command dump_mempools shows that only the anon pool bytes increased.
Further investigation shows that the whole buffer::raw object received from
the network
(created in alloc_aligned_buffer() at AsyncConnection.cc:623)
stays allocated: the whole 4M or 16M buffer::raw objects are preserved with
nref>0 because they are still referenced from PrimaryLogPG::object_contexts,
in ObjectContext::attr_cache.
This issue was reproduced on both luminous and master branches.
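
To illustrate the mechanism, here is a minimal sketch in plain C++, not Ceph
code. RawBuf and Slice are hypothetical stand-ins for buffer::raw and
buffer::ptr, and std::shared_ptr stands in for the nref counter. It only
shows how a small cached view can keep a whole message-sized allocation alive.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for buffer::raw: one contiguous allocation.
using RawBuf = std::vector<char>;

// Hypothetical stand-in for buffer::ptr: an (offset, length) view that
// shares ownership of the whole RawBuf it points into.
struct Slice {
  std::shared_ptr<RawBuf> raw;
  std::size_t off;
  std::size_t len;
};

// Bytes that stay allocated only because this slice is alive.
inline std::size_t pinned_bytes(const Slice& s) {
  return s.raw ? s.raw->size() : 0;
}

// Simulate receiving a message of msg_size bytes, keeping a small attr
// view of it (as a cache would), and then dropping the message itself.
inline Slice cache_attr_from_message(std::size_t msg_size,
                                     std::size_t attr_off,
                                     std::size_t attr_len) {
  auto msg = std::make_shared<RawBuf>(msg_size);  // "received from network"
  Slice attr{msg, attr_off, attr_len};            // no copy, shared ownership
  return attr;  // msg goes out of scope, but attr still owns the whole buffer
}
```

With a 4M message and a 200-byte attr, pinned_bytes() on the returned slice
reports the full 4M, even though only 200 bytes are of interest.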
I see at least two kinds of possible improvement:
1) memcpy relatively small parts of buffer::raw when creating a new buffer::ptr.
For example, with the following compile-time configuration parameters:
BUFFER_MIN_SIZE_COPY_FROM = 64k
BUFFER_MAX_SIZE_TO_COPY = 16k
BUFFER_MIN_RATIO_TO_COPY = 128
it will copy up to 512 bytes from a 64k raw object (64k / 128),
will copy up to 16k from a 4M object,
and will not copy from a 63k raw object.
Pros: addresses all issues of this type (preservation of
buffer::raw objects)
Cons: unknown impact, for example on memory fragmentation
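
The rule in option 1 can be sketched as follows. This is one reading of the
three parameters, chosen because it reproduces the three examples given
above. It is an assumption, not an existing Ceph implementation, and the
function names max_copy_len() and should_copy() are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>

// Parameter values taken from the example above.
constexpr std::size_t BUFFER_MIN_SIZE_COPY_FROM = 64 * 1024;
constexpr std::size_t BUFFER_MAX_SIZE_TO_COPY   = 16 * 1024;
constexpr std::size_t BUFFER_MIN_RATIO_TO_COPY  = 128;

// Largest view length that would be copied out of a raw buffer of raw_len
// bytes instead of referencing it. Returns 0 when raw_len is below the
// minimum size, meaning nothing is ever copied from such a buffer.
inline std::size_t max_copy_len(std::size_t raw_len) {
  if (raw_len < BUFFER_MIN_SIZE_COPY_FROM)
    return 0;
  return std::min(BUFFER_MAX_SIZE_TO_COPY,
                  raw_len / BUFFER_MIN_RATIO_TO_COPY);
}

// True if a new view of ptr_len bytes into a raw buffer of raw_len bytes
// should get its own small copy rather than pin the large raw buffer.
inline bool should_copy(std::size_t raw_len, std::size_t ptr_len) {
  return ptr_len > 0 && ptr_len <= max_copy_len(raw_len);
}
```

Under this reading, a 300-byte attr taken from a 4M message would be copied,
while a 20000-byte view of the same message would keep referencing it.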
2) Improvements related particularly to PrimaryLogPG::object_contexts
2.1) Set osd_pg_object_context_cache_count to 1 or 0
Cons: the cache will effectively not work
2.2) Recreate the bufferlists of attr_cache entries when inserting into the
cache, so that the attrs are copied and the huge buffer can be freed later.
Pros: minimal impact on any other subsystems
Cons: will improve only this particular case
2.3) Limit object_contexts by total used memory in addition to
osd_pg_object_context_cache_count.
Cons: the cache will probably not work, because each entry will
occupy a lot of memory and all entries will be skipped.
2.4) Remove object_contexts completely and create contexts on the fly
every time.
Cons: object_contexts does not look like a spare part that can be safely
removed
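
As a rough illustration of option 2.2, the sketch below rebuilds an attr map
so that every entry owns a right-sized copy of its bytes. It is plain C++
with hypothetical stand-in types (RawBytes, AttrView, AttrMap), not the real
bufferlist or ObjectContext API, and std::shared_ptr stands in for nref.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for buffer::raw and a view into it.
using RawBytes = std::vector<char>;
struct AttrView {
  std::shared_ptr<RawBytes> raw;
  std::size_t off;
  std::size_t len;
};
using AttrMap = std::map<std::string, AttrView>;

// Copy only the bytes the view covers into a new, right-sized buffer.
inline AttrView rebuild(const AttrView& v) {
  auto first = v.raw->begin() + static_cast<std::ptrdiff_t>(v.off);
  auto last = first + static_cast<std::ptrdiff_t>(v.len);
  return AttrView{std::make_shared<RawBytes>(first, last), 0, v.len};
}

// Rebuild every attr before it goes into the cache, so the cache no
// longer holds references to the original message-sized buffers.
inline AttrMap rebuild_attrs(const AttrMap& in) {
  AttrMap out;
  for (const auto& kv : in)
    out.emplace(kv.first, rebuild(kv.second));
  return out;
}
```

Once the original map is dropped, the rebuilt map is the only owner of the
attr bytes and the large buffer can be released.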
We tested osd_pg_object_context_cache_count=1 as a hotfix,
and it reduced OSD memory usage significantly, independent of
rgw_obj_stripe_size.
Could somebody please clarify the purpose of
PrimaryLogPG::object_contexts,
and maybe suggest how to fix this issue?
--
Best regards,
Aleksei Gutikov
Software Engineer | synesis.ru | Minsk. BY