On 16/09/2014 08:59, Ma, Jianpeng wrote: >> -----Original Message----- >> From: Loic Dachary [mailto:loic@xxxxxxxxxxx] >> Sent: Tuesday, September 16, 2014 2:47 PM >> To: Ma, Jianpeng >> Cc: ceph-devel@xxxxxxxxxxxxxxx >> Subject: Re: [PATCH 2/3] ec: make use of added aligned buffers >> >> >> >> On 16/09/2014 02:08, Ma, Jianpeng wrote: >>>> -----Original Message----- >>>> From: Sage Weil [mailto:sweil@xxxxxxxxxx] >>>> Sent: Tuesday, September 16, 2014 8:02 AM >>>> To: Ma, Jianpeng >>>> Cc: Loic Dachary; Janne Grunau; ceph-devel@xxxxxxxxxxxxxxx >>>> Subject: RE: [PATCH 2/3] ec: make use of added aligned buffers >>>> >>>> On Mon, 15 Sep 2014, Ma, Jianpeng wrote: >>>>> If we modify bufferlist::c_str() to bufferlist::c_str(bool align). >>>>> If (align) >>>>> Posix_memalign(data, CEPH_PAGE_SIZE, len) Else >>>>> Origin code. >>>> >>>> Alignment isn't really a bool, it's an int. c_str(int align=1) ? >>>> >>> I mean if we need a align memory after bufferlist::c_str. We can set >>> the align = true; Now bufferlist::c_str depend on the size when using rebuild if >> bufferlist have more prt. >>> >>> BTW, we also can change the rebuild() && rebuild(ptr). Merge two func into >> one rebuild(bool align). >>> By judge the parameter align to alloc align memory or not. >>> >>> Or am I missing something? >> >> I don't think there is a need for c_str(int align). We make every effort to allocate >> buffers that are properly aligned. If c_str() does not return an aligned buffer, >> the proper fix is to align the allocated buffer at the source, not to allocate a >> new aligned buffer and copy the content of the non aligned buffer into it. >> > >> Do you see a reason why that would be a bad way to deal with alignment ? > We only guarantee memory align after c_str and don't depend on the previous steps. This is a benefit indeed: less room for error in general. > If encode[i] have more ptr suppose they all aligned memory. > But how to guarantee aligned memory after c_str. > If bufferlist have more ptr, the aligned memory depend on the size of bufferlist. The ErasureCode::encode_prepare prepares the allocated buffer so that *) it holds enough room to contain all chunks *) c_str() on a chunk boundary will return a pointer that does not require allocating memory The ErasureCodeJerasure::get_chunk_size function determines the size of the buffer allocated by encode_prepare and further guarantees that each chunk is: *) size aligned on 16 bytes *) memory aligned on 16 bytes so that SIMD instructions can use it without moving it In other words, if c_str() had to re-allocate the buffer because it is not aligned, it would mean that the allocation of the buffer is not done as it should. Cheers > Jianpeng >> >> Cheers >> >>> >>> Jianpeng >>>> sage >>>> >>>>> >>>>> I think this is simple and correctly code. >>>>> >>>>> Jianpeng >>>>> Thanks! >>>>> >>>>>> -----Original Message----- >>>>>> From: ceph-devel-owner@xxxxxxxxxxxxxxx >>>>>> [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Loic Dachary >>>>>> Sent: Tuesday, September 16, 2014 1:20 AM >>>>>> To: Janne Grunau; ceph-devel@xxxxxxxxxxxxxxx >>>>>> Subject: Re: [PATCH 2/3] ec: make use of added aligned buffers >>>>>> >>>>>> Hi Janne, >>>>>> >>>>>> See below: >>>>>> >>>>>> On 15/09/2014 17:55, Janne Grunau wrote: >>>>>>> Requiring page aligned buffers and realigning the input if >>>>>>> necessary creates measurable oberhead. >> ceph_erasure_code_benchmark >>>>>>> is ~30% faster with this change for technique=reed_sol_van,k=2,m=1. >>>>>>> >>>>>>> Also prevents a misaligned buffer when >>>>>>> bufferlist::c_str(bufferlist) has to allocate a new buffer to >>>>>>> provide continuous one. See bug #9408 >>>>>>> >>>>>>> Signed-off-by: Janne Grunau <j@xxxxxxxxxx> >>>>>>> --- >>>>>>> src/erasure-code/ErasureCode.cc | 46 >>>>>>> +++++++++++++++++++++++++---------------- >>>>>>> src/erasure-code/ErasureCode.h | 3 ++- >>>>>>> 2 files changed, 30 insertions(+), 19 deletions(-) >>>>>>> >>>>>>> diff --git a/src/erasure-code/ErasureCode.cc >>>>>>> b/src/erasure-code/ErasureCode.cc index 5953f49..078f60b 100644 >>>>>>> --- a/src/erasure-code/ErasureCode.cc >>>>>>> +++ b/src/erasure-code/ErasureCode.cc >>>>>>> @@ -54,22 +54,38 @@ int >>>>>> ErasureCode::minimum_to_decode_with_cost(const >>>>>>> set<int> &want_to_read, } >>>>>>> >>>>>>> int ErasureCode::encode_prepare(const bufferlist &raw, >>>>>>> - bufferlist *prepared) const >>>>>>> + map<int, bufferlist> &encoded) >>>>>> const >>>>>>> { >>>>>>> unsigned int k = get_data_chunk_count(); >>>>>>> unsigned int m = get_chunk_count() - k; >>>>>>> unsigned blocksize = get_chunk_size(raw.length()); >>>>>>> - unsigned padded_length = blocksize * k; >>>>>>> - *prepared = raw; >>>>>>> - if (padded_length - raw.length() > 0) { >>>>>>> - bufferptr pad(padded_length - raw.length()); >>>>>>> - pad.zero(); >>>>>>> - prepared->push_back(pad); >>>>>>> + unsigned pad_len = blocksize * k - raw.length(); >>>>>>> + >>>>>>> + bufferlist prepared = raw; >>>>>>> + >>>>>>> + if (!prepared.is_aligned()) { >>>>>>> + prepared.rebuild_aligned(); >>>>>>> + } >>>>>>> + >>>>>>> + for (unsigned int i = 0; i < k - !!pad_len; i++) { >>>>>>> + int chunk_index = chunk_mapping.size() > 0 ? chunk_mapping[i] : i; >>>>>>> + bufferlist &chunk = encoded[chunk_index]; >>>>>>> + chunk.substr_of(prepared, i * blocksize, blocksize); } >>>>>> >>>>>> It is possible for more than one chunk to be padding. It's a border >>>>>> case but... for instance with alignment = 16, k=12 and in of length >>>>>> 1550 you end up with two padding chunks because the blocksize is 144. >>>>>> >>>>>>> + if (pad_len > 0) { >>>>>>> + int chunk_index = chunk_mapping.size() > 0 ? chunk_mapping[k >>>>>>> + - 1] : k - >>>>>> 1; >>>>>>> + bufferlist &chunk = encoded[chunk_index]; >>>>>>> + bufferptr padded(buffer::create_aligned(blocksize)); >>>>>>> + raw.copy((k - 1) * blocksize, blocksize - pad_len, padded.c_str()); >>>>>>> + padded.zero(blocksize - pad_len, pad_len); >>>>>>> + chunk.push_back(padded); >>>>>>> } >>>>>>> - unsigned coding_length = blocksize * m; >>>>>>> - bufferptr coding(buffer::create_page_aligned(coding_length)); >>>>>>> - prepared->push_back(coding); >>>>>>> - prepared->rebuild_page_aligned(); >>>>>>> + for (unsigned int i = k; i < k + m; i++) { >>>>>>> + int chunk_index = chunk_mapping.size() > 0 ? chunk_mapping[i] : i; >>>>>>> + bufferlist &chunk = encoded[chunk_index]; >>>>>>> + chunk.push_back(buffer::create_aligned(blocksize)); >>>>>>> + } >>>>>>> + >>>>>>> return 0; >>>>>>> } >>>>>>> >>>>>>> @@ -80,15 +96,9 @@ int ErasureCode::encode(const set<int> >>>>>> &want_to_encode, >>>>>>> unsigned int k = get_data_chunk_count(); >>>>>>> unsigned int m = get_chunk_count() - k; >>>>>>> bufferlist out; >>>>>>> - int err = encode_prepare(in, &out); >>>>>>> + int err = encode_prepare(in, *encoded); >>>>>>> if (err) >>>>>>> return err; >>>>>>> - unsigned blocksize = get_chunk_size(in.length()); >>>>>>> - for (unsigned int i = 0; i < k + m; i++) { >>>>>>> - int chunk_index = chunk_mapping.size() > 0 ? chunk_mapping[i] : i; >>>>>>> - bufferlist &chunk = (*encoded)[chunk_index]; >>>>>>> - chunk.substr_of(out, i * blocksize, blocksize); >>>>>>> - } >>>>>>> encode_chunks(want_to_encode, encoded); >>>>>>> for (unsigned int i = 0; i < k + m; i++) { >>>>>>> if (want_to_encode.count(i) == 0) diff --git >>>>>>> a/src/erasure-code/ErasureCode.h b/src/erasure-code/ErasureCode.h >>>>>>> index 7aaea95..62aa383 100644 >>>>>>> --- a/src/erasure-code/ErasureCode.h >>>>>>> +++ b/src/erasure-code/ErasureCode.h >>>>>>> @@ -46,7 +46,8 @@ namespace ceph { >>>>>>> const map<int, >>>> int> >>>>>> &available, >>>>>>> set<int> >>>> *minimum); >>>>>>> >>>>>>> - int encode_prepare(const bufferlist &raw, bufferlist *prepared) >>>> const; >>>>>>> + int encode_prepare(const bufferlist &raw, >>>>>>> + map<int, bufferlist> &encoded) const; >>>>>>> >>>>>>> virtual int encode(const set<int> &want_to_encode, >>>>>>> const bufferlist &in, >>>>>>> >>>>>> >>>>>> -- >>>>>> Lo?c Dachary, Artisan Logiciel Libre >>>>> >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" >>>>> in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo >>>>> info at http://vger.kernel.org/majordomo-info.html >>>>> >>>>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" >>> in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo >>> info at http://vger.kernel.org/majordomo-info.html >>> >> >> -- >> Loïc Dachary, Artisan Logiciel Libre > > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Loïc Dachary, Artisan Logiciel Libre
Attachment:
signature.asc
Description: OpenPGP digital signature