>> Hi Sage, >> Recently I tracked some single op, and found that the bufferlist::rebuild_aligned will cost some time that a litte longer. >> The whole subop spend 7ms,and the pwritev spend 1.2ms, but the >> rebuild_aligned cost 1.1ms.the test’s object length is 1M, Using >> rados put. I think the bufferlist rebuild and align should not cost so much time. The os environment have vm.swappiness = 0 and have tcmalloc. Is this normal? >> >> 2016-10-14 11:50:48.131103 queue_transactions existing >> 0x7f8900c302c0 >> osr(134.62s3 0x7f88ff458160) >> 2016-10-14 11:50:48.132477 queue_transactions (writeahead) 3234 >> [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)] >>>> cost 1.1ms FileJournal::prepare_entry => bufferlist:: >>>> rebuild_aligned > >why do you think 1.1ms is cost from rebuild... there exists a lot of things. rebuild of course is a problem we need to resolve. but it won't be problem in hdd case since it's just memory copy. i printed the timestamp on the path, and found the most time of 1.1ms spend in bufferlist::rebuild_aligned... the 1M data of pwritev is write to page cache not on disk, maybe it let bufferlist aligned seem somewhat expensive... >> 2016-10-14 11:50:48.133583 _journaled_ahead 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) [Transaction(0x7f8907736000),Transaction(0x7f89077361c0)] >> 2016-10-14 11:50:48.133620 queue_op 0x7f88ff586ba0 seq 3234 osr(134.62s3 0x7f88ff458160) 1051330 bytes (queue has 1 ops and 1051330 bytes) >> 2016-10-14 11:50:48.133630 queueing ondisk 0x7f8903ab77c0 >> 2016-10-14 11:50:48.133654 _do_op 0x7f88ff586ba0 seq 3234 >> osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 start >> 2016-10-14 11:50:48.133669 _do_transaction on 0x7f8907736000 >> 2016-10-14 11:50:48.133685 remove >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298 >> 2016-10-14 11:50:48.133754 lfn_unlink: clearing omap on >> 3#134:461e3a1c:::testfile_4M:head#298 in cid 134.62s3_head >> 2016-10-14 11:50:48.134526 remove >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#298 = 0 >> 2016-10-14 11:50:48.134557 _omap_setkeys >> 134.62s3_head/3#134:46000000::::head# >> 2016-10-14 11:50:48.134705 _omap_setkeys >> 134.62s3_head/3#134:46000000::::head# = 0 >> 2016-10-14 11:50:48.134716 _do_transaction on 0x7f89077361c0 >> 2016-10-14 11:50:48.134723 _collection_move_rename >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# >> 2016-10-14 11:50:48.134736 _set_replay_guard 3234.1.0 START >> 2016-10-14 11:50:48.135859 _set_replay_guard 3234.1.0 done >> 2016-10-14 11:50:48.136130 lfn_unlink: clearing omap on >> 3#134:461e3a1c:::testfile_4M:head# in cid 134.62s3_head >> 2016-10-14 11:50:48.136305 _close_replay_guard 3234.1.0 >> 2016-10-14 11:50:48.137023 _close_replay_guard 3234.1.0 done >> 2016-10-14 11:50:48.137039 _collection_move_rename >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head#29a from >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0 >> 2016-10-14 11:50:48.137057 write >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576 >>>> cost 1.2ms pwritev >> 2016-10-14 11:50:48.138255 write >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# 0~1048576 = 1048576 >> 2016-10-14 11:50:48.138291 setattrs >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# >> 2016-10-14 11:50:48.138313 setattrs >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0 >> 2016-10-14 11:50:48.138330 fgetattrs 70 getting 'hinfo_key' >> 2016-10-14 11:50:48.138337 setattrs >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# >> 2016-10-14 11:50:48.138357 setattrs >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# = 0 >> 2016-10-14 11:50:48.138371 fgetattrs 70 getting '_' >> 2016-10-14 11:50:48.138376 fgetattrs 70 getting 'snapset' >> 2016-10-14 11:50:48.138381 fgetattrs 70 getting 'hinfo_key' >> 2016-10-14 11:50:48.138386 setattrs >> 134.62s3_head/3#134:461e3a1c:::testfile_4M:head# >> 2016-10-14 11:50:48.138409 _do_op 0x7f88ff586ba0 seq 3234 r = 0, >> finisher 0x7f8909398680 0 >> 2016-10-14 11:50:48.138420 _finish_op 0x7f88ff586ba0 seq 3234 >> osr(134.62s3 0x7f88ff458160)/0x7f88ff458160 lat 0.007306 >> >> Than you, best regards ------------------------------------------------------------------------------------------------------------------------------------- 本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本 邮件! This e-mail and its attachments contain confidential information from H3C, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! ��.n��������+%������w��{.n����z��u���ܨ}���Ơz�j:+v�����w����ޙ��&�)ߡ�a����z�ޗ���ݢj��w�f