Do you see any performance difference between direct IO and non-direct IO mode?
If it's disabled, you don't need any buffer alignment.

On 11/20/15, 5:29 AM, "ceph-devel-owner@xxxxxxxxxxxxxxx on behalf of Haomai Wang" <ceph-devel-owner@xxxxxxxxxxxxxxx on behalf of haomaiwang@xxxxxxxxx> wrote:

>On Fri, Nov 20, 2015 at 9:08 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> On Fri, 20 Nov 2015, Haomai Wang wrote:
>>> On Fri, Nov 20, 2015 at 7:41 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>> > On Fri, 20 Nov 2015, changtao381 wrote:
>>> >> Hi All,
>>> >>
>>> >> Thanks for your reply!
>>> >>
>>> >> If direct IO + async IO requires that alignment, it shouldn't be
>>> >> aligned by PAGE for each journal entry, since it may write many
>>> >> journal entries at one time.
>>> >
>>> > We also want to avoid copying the data around in memory to change the
>>> > alignment.  The messenger takes care to read data off the wire into
>>> > buffers with the correct alignment so that we can later use them for
>>> > direct-io.
>>> >
>>> > If you're worried about the small io case, I think this is just a matter
>>> > of setting a threshold for small ios so that we don't bother with all of
>>> > the padding when the memory copy isn't that expensive.  But... given that
>>> > we have a header *and* footer in the journal format and almost all IOs are
>>> > 4k multiples I think it'd save you a single 4k block at most.
>>> >
>>> > (Also, I thought we already did something like this, but perhaps not!)
>>>
>>> Hmm, based on our recent tests, the data from the messenger is aligned.
>>> But the encoded data (pglog, transaction) makes things worse: before
>>> PR(https://github.com/ceph/ceph/pull/6368) solved it, we could even get 14
>>> ptrs in the bufferlist passed into filejournal. So we end up
>>> rebuilding every time within the filejournal thread. With
>>> PR(https://github.com/ceph/ceph/pull/6484), we try to do the rebuild
>>> outside the filejournal thread, which is a single thread.
>>
>> buffer::list::rebuild_page_aligned() should only copy/rebuild ptrs that
>> are unaligned, and leave aligned ones untouched.  It looks like the
>> journal code is already doing this?
>
>Yes and no. For example, say we have a bufferlist that contains 2 ptrs: the
>first is unaligned, the second is aligned. The current impl will
>ignore the fact that the second one is aligned. Look at the code:
>
>  void buffer::list::rebuild_aligned_size_and_memory(unsigned align_size,
>                                                     unsigned align_memory)
>  {
>........
>      list unaligned;
>      unsigned offset = 0;
>      do {
>        /*cout << " segment " << (void*)p->c_str()
>               << " offset " << ((unsigned long)p->c_str() & (align - 1))
>               << " length " << p->length() << " " << (p->length() & (align - 1))
>               << " overall offset " << offset << " " << (offset & (align - 1))
>               << " not ok" << std::endl;
>        */
>        offset += p->length();
>        unaligned.push_back(*p);
>        _buffers.erase(p++);
>      } while (p != _buffers.end() &&
>               (!p->is_aligned(align_memory) ||
>                !p->is_n_align_sized(align_size) ||
>                (offset % align_size)));
>((((((((((((( it also checks the offset alignment, so an aligned ptr that
>follows the first unaligned ptr is not left untouched ))))))))))))))
>
>      if (!(unaligned.is_contiguous() &&
>            unaligned._buffers.front().is_aligned(align_memory))) {
>        ptr nb(buffer::create_aligned(unaligned._len, align_memory));
>        unaligned.rebuild(nb);
>        _memcopy_count += unaligned._len;
>      }
>      _buffers.insert(p, unaligned._buffers.front());
>    }
>  }
>
>
>>
>> sage
>
>
>
>--
>Best Regards,
>
>Wheat
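
To make the two-ptr example above concrete, here is a minimal sketch that reproduces only the do/while condition from the quoted loop. It is not Ceph code: `Seg` and `unaligned_run_end` are hypothetical names invented for illustration, and each segment is reduced to the two properties the condition tests (memory alignment and length).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a buffer::ptr (assumption, not the Ceph type).
struct Seg {
  bool mem_aligned;  // what p->is_aligned(align_memory) would return
  unsigned length;   // what p->length() would return
};

// Mirrors the quoted do/while: starting at `start`, returns the index one
// past the last segment that gets swept into the "unaligned" run.
std::size_t unaligned_run_end(const std::vector<Seg>& segs,
                              std::size_t start,
                              unsigned align_size) {
  assert(start < segs.size());
  std::size_t p = start;
  unsigned offset = 0;
  do {
    offset += segs[p].length;  // bytes accumulated in the run so far
    ++p;
  } while (p != segs.size() &&
           (!segs[p].mem_aligned ||                // next ptr not memory aligned
            (segs[p].length % align_size) != 0 ||  // next ptr not a size multiple
            (offset % align_size) != 0));          // run ends at an unaligned offset
  return p;
}
```

With a 100-byte unaligned segment followed by an aligned 4096-byte segment and align_size 4096, the function returns 2: the run offset (100) is not a multiple of 4096, so the aligned segment is pulled into the run as well. If the first segment is 4096 bytes long instead, the run stops after it and the function returns 1.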