I think each write will create two objects (a 512 KB head object plus the rest of the contents) if your object size is > 512 KB. RGW also writes some xattrs on top of what the OSD itself writes. Don't take my word for it blindly, as I am not fully familiar with RGW :-) I guess this will dirty a significant number of inodes. But I think the effect will be much more severe in the RBD partial random-write case. (A rough way to check the xattr/inode activity is sketched at the end of this mail.)

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of flisky
Sent: Wednesday, December 02, 2015 6:39 AM
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re: does anyone know what xfsaild and kworker are? they make osd disk busy, produce 100-200 iops per osd disk?

Ignore my last reply.

I read the thread "Re: XFS Syncd" (http://oss.sgi.com/archives/xfs/2015-06/msg00111.html) and found that this might be okay. The xfs_ail_push calls are almost all for INODE items rather than BUF (1579 vs 99).

Our Ceph cluster is dedicated to the S3 service, and the writes are small. So where do so many inode changes come from? How can I decrease them? Thanks in advance!

======================
Mount Options:
rw,noatime,seclabel,swalloc,attr2,largeio,nobarrier,inode64,logbsize=256k,noquota

======================
XFS Info:
meta-data=/dev/sdb1              isize=2048   agcount=4, agsize=182979519 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=731918075, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=357381, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

On 2015-12-02 16:20, flisky wrote:
> It works. However, I think the root cause is xfs_buf misses?
>
> trace-cmd record -e xfs\*
> trace-cmd report > xfs.txt
> awk '{print $4}' xfs.txt | sort -n | uniq -c | sort -n | tail -n 20
>
>  14468 xfs_file_splice_write:
>  16562 xfs_buf_find:
>  19597 xfs_buf_read:
>  19634 xfs_buf_get:
>  21943 xfs_get_blocks_alloc:
>  23265 xfs_perag_put:
>  26327 xfs_perag_get:
>  27853 xfs_ail_locked:
>  39252 xfs_buf_iorequest:
>  40187 xfs_ail_delete:
>  41590 xfs_buf_ioerror:
>  42523 xfs_buf_hold:
>  44659 xfs_buf_trylock:
>  47986 xfs_ail_flushing:
>  50793 xfs_ilock_nowait:
>  57585 xfs_ilock:
>  58293 xfs_buf_unlock:
>  79977 xfs_buf_iodone:
> 104165 xfs_buf_rele:
> 108383 xfs_iunlock:
>
> Could you please give me another hint? :) Thanks!
>
> On 2015-12-02 05:14, Somnath Roy wrote:
>> Sure. The following settings helped me minimize the effect a bit
>> for the PR https://github.com/ceph/ceph/pull/6670
>>
>> sysctl -w fs.xfs.xfssyncd_centisecs=720000
>> sysctl -w fs.xfs.xfsbufd_centisecs=3000
>> sysctl -w fs.xfs.age_buffer_centisecs=720000
>>
>> But for the existing Ceph write path you may need to tweak these.
>>
>> Thanks & Regards
>> Somnath
>>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf
>> Of flisky
>> Sent: Tuesday, December 01, 2015 11:04 AM
>> To: ceph-users@xxxxxxxxxxxxxx
>> Subject: Re: does anyone know what xfsaild and kworker are? they make osd disk busy, produce 100-200 iops per osd disk?
>>
>> On 2015-12-02 01:31, Somnath Roy wrote:
>>> This is the xfs metadata sync process. When it wakes up and there
>>> is a lot of data to sync, it will throttle all the processes accessing
>>> the drive. There are some xfs settings to control the behavior, but
>>> you can't stop it.
>> May I ask how to tune the xfs settings? Thanks!
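
P.S. If you want to confirm where the inode/xattr churn comes from, something along these lines might help. This is only a sketch: the filestore path, the "__head_" filename pattern, and the XFS_LI_INODE / XFS_LI_BUF strings are from memory, so adjust them for your setup and check your trace output first.

# Dump the xattrs that RGW/the OSD leave on a few on-disk objects
# (path and name pattern are guesses -- point this at one of your OSDs)
find /var/lib/ceph/osd/ceph-0/current -type f -name '*__head_*' | head -n 5 |
while read -r f; do
    echo "== $f"
    getfattr -d -m - -e hex -- "$f"
done

# From the trace you already captured, compare inode items vs. buffer
# items being pushed by xfsaild (I believe the item type is printed as
# XFS_LI_INODE / XFS_LI_BUF in xfs.txt, but verify on your kernel)
grep xfs_ail_push xfs.txt | grep -c XFS_LI_INODE
grep xfs_ail_push xfs.txt | grep -c XFS_LI_BUF

If most of the AIL traffic is inode items and each object carries several ceph xattrs, that would line up with the RGW theory above.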
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com